Abstract:
Avatar animation systems disclosed herein provide high-quality, real-time avatar animation that is based on the varying countenance of a human face. In some example embodiments, the real-time provision of high-quality avatar animation is enabled, at least in part, by a multi-frame regressor that is configured to map information descriptive of facial expressions depicted in two or more images to information descriptive of a single avatar blend shape. The two or more images may be temporally sequential images. This multi-frame regressor implements a machine learning component that generates the high-quality avatar animation from information descriptive of a subject's face and/or information descriptive of avatar animation frames previously generated by the multi-frame regressor. The machine learning component may be trained using a set of training images that depict human facial expressions, together with avatar animation authored by professional animators to reflect the facial expressions depicted in those training images.
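As an illustration of the many-frames-to-one-blend-shape mapping this abstract describes, here is a minimal sketch in Python (NumPy only) of a multi-frame regressor as a small feed-forward network. The frame count, landmark count, blend-shape count, and layer sizes are assumptions for illustration, not details from the disclosure.

```python
# Hypothetical sketch: map landmark features from N temporally sequential
# frames to the blend-shape weights of a single avatar animation frame.
import numpy as np

N_FRAMES = 3          # temporally sequential input frames (assumed)
N_LANDMARKS = 68      # 2D facial landmarks per frame (assumed)
N_BLENDSHAPES = 51    # avatar blend-shape weights to predict (assumed)

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.01, (N_FRAMES * N_LANDMARKS * 2, 256))
b1 = np.zeros(256)
W2 = rng.normal(0, 0.01, (256, N_BLENDSHAPES))
b2 = np.zeros(N_BLENDSHAPES)

def regress_blendshapes(frame_landmarks: np.ndarray) -> np.ndarray:
    """Map (N_FRAMES, N_LANDMARKS, 2) landmarks to one blend-shape vector."""
    x = frame_landmarks.reshape(-1)       # concatenate all frames' features
    h = np.maximum(W1.T @ x + b1, 0.0)    # hidden layer with ReLU
    return W2.T @ h + b2                  # weights for one animation frame

# Example: landmarks from three sequential frames -> one blend-shape vector.
weights = regress_blendshapes(rng.normal(size=(N_FRAMES, N_LANDMARKS, 2)))
```

In a trained system the weights would come from the professional-animator training pairs the abstract mentions; random weights here merely keep the sketch self-contained.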
Abstract:
Examples of systems and methods for augmented facial animation are generally described herein. A method for mapping facial expressions to an alternative avatar expression may include capturing a series of images of a face, and detecting a sequence of facial expressions of the face from the series of images. The method may include determining an alternative avatar expression mapped to the sequence of facial expressions, and animating an avatar using the alternative avatar expression.
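A minimal sketch of the sequence-to-alternative-expression lookup this abstract describes, in Python. The expression labels and the mapping table are illustrative assumptions; a real system would detect the labels from the captured image series.

```python
# Hypothetical mapping from detected expression sequences to "alternative"
# avatar expressions (expressions that need not mirror the face).
EXPRESSION_MAP = {
    ("neutral", "smile", "smile"): "avatar_laugh",
    ("frown", "frown"): "avatar_storm_cloud",
    ("smile", "wink"): "avatar_heart_eyes",
}

def alternative_expression(sequence):
    """Return the mapped avatar expression, or mirror the latest one."""
    for pattern, avatar_expr in EXPRESSION_MAP.items():
        n = len(pattern)
        # Match the most recent n detected expressions against the pattern.
        if tuple(sequence[-n:]) == pattern:
            return avatar_expr
    return sequence[-1]  # fall back to mirroring the face

print(alternative_expression(["neutral", "smile", "smile"]))  # avatar_laugh
```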
Abstract:
Techniques related to automatic perspective control of images using vanishing points are discussed. Such techniques may include determining a perspective control vanishing point associated with the image based on lines detected within the image, rotating the image based on the perspective control vanishing point to generate an aligned image, and warping the aligned image based on aligning two lines of the detected lines that meet at the perspective control vanishing point.
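A minimal sketch of the rotate-then-warp flow this abstract describes, using OpenCV. It assumes the two converging line segments have already been selected (a real system would pick them from many detected Hough lines), and the rotation geometry is a simplified illustration.

```python
# Hypothetical perspective control: intersect two lines to find the
# vanishing point, rotate toward it, then warp so the lines become parallel.
import numpy as np
import cv2

def vanishing_point(l1, l2):
    """Intersect two segments ((x1,y1),(x2,y2)) in homogeneous coordinates."""
    def line(p, q):
        return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])
    vp = np.cross(line(*l1), line(*l2))
    return vp[:2] / vp[2]

def correct_perspective(img, l1, l2):
    h, w = img.shape[:2]
    vx, vy = vanishing_point(l1, l2)
    # Rotate the image so the vanishing direction points straight up.
    angle = np.degrees(np.arctan2(vx - w / 2.0, -(vy - h / 2.0)))
    R = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, 1.0)
    aligned = cv2.warpAffine(img, R, (w, h))
    # Carry the line endpoints through the same rotation.
    pts = cv2.transform(np.float32([l1[0], l1[1], l2[0], l2[1]])[None], R)[0]
    # Warp so the two converging lines become parallel verticals.
    dst = np.float32([[pts[0][0], pts[0][1]], [pts[0][0], pts[1][1]],
                      [pts[2][0], pts[2][1]], [pts[2][0], pts[3][1]]])
    H = cv2.getPerspectiveTransform(np.float32(pts), dst)
    return cv2.warpPerspective(aligned, H, (w, h))
```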
Abstract:
Techniques are provided for facial recognition using decoy-based matching of facial image features. An example method may include comparing extracted facial features of an input image, provided for recognition, to facial features of each of one or more images in a gallery of known faces, to select a closest gallery image. The method may also include calculating a first distance between the input image and the selected gallery image. The method may further include comparing the facial features of the input image to facial features of each of one or more images in a set of decoy faces, to select a closest decoy image and calculating a second distance between the input image and the selected decoy image. The method may further include recognizing a match between the input image and the selected gallery image based on a comparison of the first distance and the second distance.
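The decision rule in this abstract reduces to comparing two distances, which the following Python sketch makes concrete. It assumes faces are already embedded as feature vectors (the embedding model is out of scope), and the Euclidean metric and margin are illustrative choices.

```python
# Hypothetical decoy-based matching over precomputed feature vectors.
import numpy as np

def recognize(probe, gallery, decoys, margin=0.0):
    """Return the index of the matched gallery face, or None for a reject.

    probe:   (d,) feature vector of the input face
    gallery: (n, d) features of known faces
    decoys:  (m, d) features of decoy faces (known non-targets)
    """
    d_gallery = np.linalg.norm(gallery - probe, axis=1)
    d_decoy = np.linalg.norm(decoys - probe, axis=1)
    best = int(np.argmin(d_gallery))          # closest gallery image
    # Match only if the probe is closer to the gallery than to any decoy.
    if d_gallery[best] + margin < d_decoy.min():
        return best
    return None
```

The decoy set acts as a data-driven rejection threshold: an impostor tends to land nearer some decoy than the closest gallery face, so no fixed global cutoff is needed.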
Abstract:
Apparatuses, methods and storage medium associated with 3D face model reconstruction are disclosed herein. In embodiments, an apparatus may include a facial landmark detector, a model fitter and a model tracker. The facial landmark detector may be configured to detect a plurality of landmarks of a face and their locations within each of a plurality of image frames. The model fitter may be configured to generate a 3D model of the face from a 3D model of a neutral face, in view of detected landmarks of the face and their locations within a first one of the plurality of image frames. The model tracker may be configured to maintain the 3D model to track the face in subsequent image frames, successively updating the 3D model in view of detected landmarks of the face and their locations within each of successive ones of the plurality of image frames. In embodiments, the facial landmark detector may include a face detector, an initial facial landmark detector, and one or more facial landmark detection linear regressors. Other embodiments may be described and/or claimed.
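A minimal sketch of the model-fitting step, assuming a linear 3D face model (neutral shape plus a weighted blend-shape basis) and orthographic projection; the dimensions, random basis, and least-squares solver are illustrative assumptions, not the disclosed regressors.

```python
# Hypothetical fit of a linear 3D face model to detected 2D landmarks.
import numpy as np

N_LMK, N_BASIS = 68, 40
rng = np.random.default_rng(1)
neutral = rng.normal(size=(N_LMK, 3))          # 3D neutral-face landmarks
basis = rng.normal(size=(N_BASIS, N_LMK, 3))   # blend-shape deltas

def fit_face(landmarks_2d):
    """Solve least-squares blend weights so the projected model (orthographic:
    drop z) matches the detected 2D landmarks; return the fitted 3D face."""
    A = basis[:, :, :2].reshape(N_BASIS, -1).T  # (2*N_LMK, N_BASIS)
    b = (landmarks_2d - neutral[:, :2]).reshape(-1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return neutral + np.tensordot(w, basis, axes=1)

# Tracking amounts to refitting each frame, warm-started by the prior fit.
fitted = fit_face(rng.normal(size=(N_LMK, 2)))
```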
Abstract:
System, apparatus, method, and computer-readable media for on-the-fly captured image data enhancement. An image or video stream is enhanced with a filter in concurrence with generation of the stream by a camera module. In one exemplary embodiment, HD image frames are filtered at a rate of 30 fps, or more, to enhance human skin tones with an edge-preserving smoothing filter. In embodiments, the smoothing filter is applied to an image representation of reduced resolution, reducing the computational overhead of the filter. The filtered image is then upsampled and blended with a map that identifies edges to maintain an edge quality comparable to a smoothing filter applied at full resolution. A device platform including a camera module and comporting with the exemplary architecture may provide enhanced video camera functionality even at low image processing bandwidth.
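A minimal sketch of the reduced-resolution pipeline using OpenCV: filter a downsampled copy, upsample it, and blend it with the original frame using an edge map so edges stay sharp. The bilateral filter, Canny thresholds, and blend weights are illustrative assumptions.

```python
# Hypothetical edge-preserving smoothing at reduced resolution.
import numpy as np
import cv2

def enhance_frame(frame):
    """frame: uint8 BGR image; returns the smoothed, edge-preserved frame."""
    h, w = frame.shape[:2]
    small = cv2.pyrDown(frame)                        # quarter the pixels
    smoothed = cv2.bilateralFilter(small, 9, 75, 75)  # edge-preserving blur
    up = cv2.resize(smoothed, (w, h), interpolation=cv2.INTER_LINEAR)
    # Edge map: keep original pixels where edges are strong.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    alpha = cv2.GaussianBlur(edges, (7, 7), 0).astype(np.float32) / 255.0
    alpha = alpha[:, :, None]                         # broadcast over BGR
    return (alpha * frame + (1.0 - alpha) * up).astype(np.uint8)
```

Filtering at quarter resolution cuts the bilateral filter's cost roughly fourfold, which is what makes the 30 fps HD target plausible on a low-bandwidth platform.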
Abstract:
There is provided a method of processing a digital image including: (a) obtaining a plurality of images; (b) converting the plurality of images into histograms; (c) setting one of the plurality of images as a reference image and another of the plurality of images as a comparison target image; (d) adjusting a distribution of the histogram of the reference image to match a distribution of the histogram of the comparison target image to produce an adjusted reference image; (e) comparing a difference between the adjusted reference image and the comparison target image to produce a masking image; (f) applying the masking image to the comparison target image to produce an adjusted comparison target image; and (g) combining the reference image and the adjusted comparison target image to produce a high dynamic range (HDR) image. Accordingly, even if there is complex motion of a subject, a clear image without image overlap or ghosting may be obtained when producing the HDR image.
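A minimal sketch of steps (b) through (g) for two grayscale exposures, in NumPy; the difference threshold and the simple averaging combine are illustrative assumptions.

```python
# Hypothetical ghost-free HDR merge via histogram matching and masking.
import numpy as np

def match_histogram(ref, target):
    """Adjust ref so its histogram matches target's distribution (step d)."""
    r_vals, r_counts = np.unique(ref, return_counts=True)
    t_vals, t_counts = np.unique(target, return_counts=True)
    r_cdf = np.cumsum(r_counts) / ref.size
    t_cdf = np.cumsum(t_counts) / target.size
    mapped = np.interp(r_cdf, t_cdf, t_vals)        # CDF-to-CDF lookup
    return np.interp(ref.ravel(), r_vals, mapped).reshape(ref.shape)

def merge_hdr(ref, target, thresh=25):
    adjusted_ref = match_histogram(ref, target)
    mask = np.abs(adjusted_ref - target) > thresh            # (e) ghost mask
    adjusted_target = np.where(mask, adjusted_ref, target)   # (f) apply mask
    return (ref.astype(np.float32) + adjusted_target) / 2.0  # (g) combine
```

Matching the histograms first is what lets a plain difference threshold find true motion: after step (d) the two exposures have comparable brightness, so large residual differences indicate a moving subject rather than exposure change.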
Abstract:
According to a method for providing a notification on a face recognition environment of the present disclosure, the method includes obtaining an input image that is input in a preview state, and comparing feature information for a face included in the input image with feature information for a plurality of reference images of people stored in a predetermined database to determine, in real time, whether the input image satisfies a predetermined effective condition for photographing. The predetermined effective condition for photographing is information regarding a condition necessary for recognizing the face included in the input image at a higher accuracy level than a predetermined accuracy level. The method further includes providing a user with predetermined feedback for photographing guidance that corresponds to whether the predetermined effective condition for photographing is satisfied. According to the method, the condition of a face image detected for face recognition is checked, and if any element is unsuitable for recognizing the face, the user is notified so that the obstruction hindering face recognition can be removed, enhancing the success rate of face recognition.
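A minimal sketch of checking an effective condition for photographing on a preview frame; the specific checks (size, brightness, blur) and their thresholds are illustrative assumptions, not the patent's own criteria.

```python
# Hypothetical preview-time feedback for face-recognition guidance.
import cv2

def photographing_feedback(gray_preview, face_box, min_size=80):
    """gray_preview: uint8 grayscale frame; face_box: (x, y, w, h)."""
    x, y, w, h = face_box
    face = gray_preview[y:y + h, x:x + w]
    if min(w, h) < min_size:
        return "Move closer: the face is too small to recognize."
    if face.mean() < 60:
        return "Scene is too dark: add light or face a light source."
    # Low Laplacian variance is a standard blur heuristic.
    if cv2.Laplacian(face, cv2.CV_64F).var() < 100:
        return "Hold still: the image is too blurry for recognition."
    return "OK: conditions are suitable for face recognition."
```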
Abstract:
Systems, apparatus, articles of manufacture, and methods are disclosed to explain machine learning models. An example apparatus includes programmable circuitry to generate a bitmask having a kernel size, apply the bitmask to an input image to generate a first masked input image, and compute a first saliency score for a first pixel of the input image based on the first masked input image. Additionally, the example programmable circuitry is to compute a second saliency score for a second pixel of the input image based on an output generated by a machine learning model, the output based on a second masked input image. The example programmable circuitry is also to generate a saliency map for the input image based on the first saliency score and the second saliency score, the saliency map indicative of at least one feature of the input image that contributed to training of the machine learning model.
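A minimal sketch of masked-input saliency in Python, assuming a black-box `model` scoring function; the kernel size, zero-fill masking, and score definition are illustrative assumptions.

```python
# Hypothetical occlusion saliency: mask kernel-sized patches and record how
# much each masking changes the model's output; larger drops = more salient.
import numpy as np

def saliency_map(image, model, kernel=8):
    base = model(image)
    sal = np.zeros(image.shape[:2], dtype=np.float32)
    for y in range(0, image.shape[0], kernel):
        for x in range(0, image.shape[1], kernel):
            masked = image.copy()
            masked[y:y + kernel, x:x + kernel] = 0   # apply the bitmask
            # Score every pixel in the patch by the output change it caused.
            sal[y:y + kernel, x:x + kernel] = base - model(masked)
    return sal

# Example with a toy "model": the mean intensity of the image.
img = np.random.default_rng(2).random((32, 32))
smap = saliency_map(img, model=lambda im: float(im.mean()))
```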