Abstract:
An apparatus for generating a three-dimensional (3D) face model includes a multi-view image capturer configured to sense motion of a mobile device and automatically capture still images from two or more directions; and a 3D model generator configured to generate a 3D face model using the two or more still images obtained by the multi-view image capturer.
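As a rough illustration of how motion sensing can trigger multi-view capture, the sketch below selects capture orientations from a stream of device-orientation readings. The function name, the use of a single rotation angle, and the 30° threshold are illustrative assumptions, not details from the abstract.

```python
def select_capture_angles(orientation_stream, threshold_deg=30.0):
    """Return the orientations at which a still image would be captured.

    A still is taken whenever the device has rotated by at least
    `threshold_deg` since the last capture, yielding images from
    two or more directions around the face.
    """
    captured = []
    last = None
    for angle in orientation_stream:
        if last is None or abs(angle - last) >= threshold_deg:
            captured.append(angle)
            last = angle
    return captured
```

For example, a sweep through 0°, 5°, 10°, 35°, 40°, 70° would capture at 0°, 35°, and 70°, giving three viewpoints for model generation.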
Abstract:
Disclosed is a method of reconstructing a three-dimensional color mesh and an apparatus for the same. According to an embodiment of the present disclosure, the method includes: receiving mesh information of an object, multiple multi-view images obtained by photographing the object at different positions, and camera parameter information corresponding to the multiple multi-view images; constructing a texture map with respect to the object on the basis of the received information and setting a texture patch that refers to the color values of the same multi-view image; correcting a color value of a vertex included in each texture patch; and performing rendering with respect to the object by applying the corrected color value of the vertex to the texture map.
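One plausible reading of the per-patch vertex color correction is seam smoothing: a vertex shared by patches that sample different source images receives inconsistent colors, and averaging them reduces visible seams. The sketch below is a minimal illustration under that assumption; the data layout and function name are hypothetical.

```python
from collections import defaultdict

def correct_seam_vertex_colors(patch_vertex_colors):
    """patch_vertex_colors: {patch_id: {vertex_id: (r, g, b)}}.

    A vertex that appears in more than one texture patch gets the mean
    of the colors those patches assign to it, so patches that sample
    different multi-view images agree at their shared boundary.
    """
    samples = defaultdict(list)
    for colors in patch_vertex_colors.values():
        for vertex, color in colors.items():
            samples[vertex].append(color)
    corrected = {}
    for vertex, colors in samples.items():
        n = len(colors)
        corrected[vertex] = tuple(sum(channel) / n for channel in zip(*colors))
    return corrected
```

The corrected vertex colors would then be written back into the texture map before rendering.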
Abstract:
A method of converting a two-dimensional video to a three-dimensional video, the method comprising: comparing an image of an nth frame with an accumulated image up to the (n−1)th frame in the two-dimensional video to calculate a difference in a color value for each pixel; generating a difference image including information on a change in a color value for each pixel of the nth frame; storing an accumulated image up to the nth frame by accumulating the information on the change in the color value for each pixel up to the nth frame; performing an operation for a pixel in which a change in a color value is equal to or larger than a predetermined level by using the difference image to generate a division image and a depth map image; and converting the image of the nth frame to a three-dimensional image by using the depth map image.
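The difference-image step can be sketched as follows on grayscale images represented as nested lists; the threshold value and function name are assumptions for illustration. Pixels whose change meets the threshold are the ones that would feed the division-image and depth-map operations.

```python
def difference_mask(frame, accumulated, threshold):
    """Compute the per-pixel absolute difference between the nth frame
    and the accumulated image up to frame n-1, plus a mask flagging
    pixels whose change is at least `threshold` for further processing.
    """
    diff = [[abs(f - a) for f, a in zip(frow, arow)]
            for frow, arow in zip(frame, accumulated)]
    mask = [[d >= threshold for d in row] for row in diff]
    return diff, mask
```

Restricting the segmentation and depth-map work to masked pixels is what lets the method skip unchanged regions frame to frame.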
Abstract:
A neural network device for learning dependency of feature data includes: a memory in which at least one program is stored; and a processor that performs a calculation by executing the at least one program, in which the processor is configured to acquire graph information including a data node for a human body; extract feature data corresponding to a plurality of joints constituting the human body from the graph information; acquire a self-attention output corresponding to the feature data based on a self-attention mechanism; and generate result data for a motion of the human body based on the self-attention output, and the self-attention output includes position information acquired based on positional encoding of the feature data and structural information acquired based on geodesic encoding of the feature data.
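A minimal sketch of the self-attention step appears below. The learned query/key/value projections are omitted (identity projections are used instead), and the particular positional encoding (sinusoidal) and geodesic encoding (normalized skeletal hop distances used to aggregate neighboring joints' features) are illustrative assumptions; the abstract only states that both encodings contribute to the self-attention output.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention with identity Q/K/V projections
    (a simplification of the patent's learned projections)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def joint_attention(features, geodesic_dist):
    """features: (J, D) feature data for J joints.
    geodesic_dist: (J, J) hop distances between joints along the skeleton.

    Position and skeletal-structure information are added to the joint
    features before self-attention; both encodings here are assumed forms.
    """
    J, D = features.shape
    # Sinusoidal positional encoding over the joint index (assumption).
    pos = np.array([[np.sin(j / 10000 ** (2 * (k // 2) / D)) for k in range(D)]
                    for j in range(J)])
    # Geodesic encoding: mix features by normalized skeletal proximity.
    geo = geodesic_dist / max(geodesic_dist.max(), 1)
    geo_enc = geo @ features / J
    return self_attention(features + pos + geo_enc)
```

The result data for the motion of the body would be produced by further layers operating on this attention output.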
Abstract:
A method and apparatus for generating a face-harmonized image are disclosed. The method of generating a face-harmonized image includes receiving an input image, extracting facial landmarks from a target image and the input image, generating a face-removed image of the target image based on a facial mask region, extracting a user face image from the input image, transforming the user face image to correspond to the facial mask region, generating a face-blended image by blending the transformed user face image with the target image, extracting a feature map of the face-blended image, generating a combined feature map based on the feature map of the face-blended image and a feature map of the target image, generating a face harmonization result image based on the combined feature map, and providing the generated face harmonization result image.
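The blending step, before the learned harmonization network refines the result, can be approximated as a per-pixel masked blend. The sketch below uses grayscale nested lists and a simple linear blend as a stand-in; the function and parameter names are hypothetical.

```python
def blend_face(target, user_face, mask, alpha=1.0):
    """Paste the (already transformed/warped) user face into the target
    image inside the facial mask region.

    `alpha` controls the blend inside the mask; pixels outside the mask
    keep the target image's values. This linear blend stands in for the
    patent's feature-map-based harmonization.
    """
    blended = []
    for trow, urow, mrow in zip(target, user_face, mask):
        blended.append([u * alpha + t * (1 - alpha) if m else t
                        for t, u, m in zip(trow, urow, mrow)])
    return blended
```

The face-blended image produced this way is what the feature-extraction and combined-feature-map stages would then harmonize with the target's appearance.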
Abstract:
Provided are a system and method for generating an image implemented in the same style as a single image sample, based on a result of learning that sample, the system including: an image learning unit configured to train a deep learning network model based on an image sample of a specific style to generate a same-style generative model network; and an image generating unit configured to input a layout image to the same-style generative model network to generate an image having the same style as the image sample while conforming to a layout structure of the layout image.
Abstract:
Provided are a method and system for training a dynamic deep neural network. The method for training a dynamic deep neural network includes receiving an output of a last layer of the deep neural network and outputting a first loss, receiving an output of a routing module according to an input class of the deep neural network and outputting a second loss, calculating a third loss based on the first loss and the second loss, and updating a weight of the deep neural network by using the third loss.
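The abstract only states that the third loss is calculated from the first two; a weighted sum is a common and plausible form, sketched below. The weight `lam` and the function name are assumptions.

```python
def combined_loss(first_loss, second_loss, lam=1.0):
    """Third loss as a weighted combination of the last-layer loss
    (first) and the routing-module loss (second). The network weights
    would then be updated by gradients of this combined value.
    """
    return first_loss + lam * second_loss
```

With `lam = 0.5`, a last-layer loss of 1.0 and a routing loss of 2.0 combine to a third loss of 2.0, which drives the weight update.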
Abstract:
An apparatus and method of producing stereoscopic subtitles by analyzing a three-dimensional (3D) space are disclosed, the apparatus including a camera position calculator to calculate a position of a first camera and a position of a second camera from a first image and a second image, respectively, a subtitle flat arranger to arrange a subtitle flat using a viewing direction of the first camera and a viewing direction of the second camera at the calculated positions, and a subtitle producer to produce subtitles using the subtitle flat.
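One way to arrange such a subtitle flat (plane) is to place it at a chosen depth along the mean of the two cameras' viewing directions, facing back toward them. The placement rule below is purely an assumed illustration, not the patent's actual geometry.

```python
def subtitle_plane(cam1_pos, cam1_dir, cam2_pos, cam2_dir, depth):
    """Return (plane_point, plane_normal) for the subtitle plane.

    The plane is anchored `depth` units along the normalized mean
    viewing direction, starting from the midpoint of the two camera
    positions, and oriented to face the cameras.
    """
    mid = tuple((a + b) / 2 for a, b in zip(cam1_pos, cam2_pos))
    mean_dir = tuple((a + b) / 2 for a, b in zip(cam1_dir, cam2_dir))
    norm = sum(c * c for c in mean_dir) ** 0.5
    mean_dir = tuple(c / norm for c in mean_dir)
    point = tuple(m + depth * c for m, c in zip(mid, mean_dir))
    normal = tuple(-c for c in mean_dir)   # face back toward the cameras
    return point, normal
```

Subtitles rendered onto this plane then project consistently into both the first and second images.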
Abstract:
Disclosed are an apparatus and a method for extracting, from an input image sequence, a foreground object layer area whose depth value is discontinuous with that of the background. With the present disclosure, once the user sets the layer area in the start frame of an image sequence in which the depth values of the foreground and the background are discontinuous, the layer area is automatically tracked through the subsequent frames, thereby extracting a foreground layer area in which the drift phenomenon and the flickering phenomenon are reduced.
Abstract:
A method of generating a moving viewpoint motion picture, which is performed by a processor that executes at least one instruction stored in a memory, may comprise: obtaining an input image; generating a trimap from the input image; generating a depth map using the input image; generating a foreground mesh/texture map model based on a foreground alpha map obtained based on the trimap and foreground depth information obtained based on the trimap and the depth map; and generating a moving viewpoint motion picture based on the foreground mesh/texture map model.
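A trimap labels each pixel as definite foreground, definite background, or unknown. As a minimal illustration, the sketch below derives a trimap from a binary foreground mask by marking a border band as unknown; the real pipeline generates the trimap from the input image itself, and the band-based rule and pixel values (0/128/255) are assumptions.

```python
def make_trimap(mask, band=1):
    """Derive a trimap (0 background, 128 unknown, 255 foreground) from
    a binary foreground mask.

    Any pixel within `band` pixels of a foreground/background boundary
    is marked unknown; this emulates morphological erosion/dilation
    with a direct neighborhood scan.
    """
    h, w = len(mask), len(mask[0])
    tri = [[255 if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-band, band + 1):
                for dx in range(-band, band + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] != mask[y][x]:
                        tri[y][x] = 128
    return tri
```

The unknown band is where the foreground alpha map would be solved by matting, while the definite regions feed the foreground mesh/texture map model directly.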