Abstract:
Embodiments of the present disclosure provide a method for constructing a speech decoding network in digital speech recognition. The method comprises: acquiring training data obtained by digital speech recording, the training data comprising a plurality of speech segments, each speech segment comprising a plurality of digital speeches; performing acoustic feature extraction on the training data to obtain a feature sequence corresponding to each speech segment; performing progressive training starting from a mono-phoneme acoustic model to obtain an acoustic model; and acquiring a language model and constructing a speech decoding network from the language model and the trained acoustic model.
Abstract:
Processing circuitry of an information processing apparatus obtains a set of identity vectors that are calculated according to voice samples from speakers. The identity vectors are classified into speaker classes respectively corresponding to the speakers. The processing circuitry selects, from the identity vectors, first subsets of interclass neighboring identity vectors respectively corresponding to the identity vectors and second subsets of intraclass neighboring identity vectors respectively corresponding to the identity vectors. The processing circuitry determines an interclass difference based on the first subsets of interclass neighboring identity vectors and the corresponding identity vectors, and determines an intraclass difference based on the second subsets of intraclass neighboring identity vectors and the corresponding identity vectors. Further, the processing circuitry determines a set of basis vectors to maximize a projection of the interclass difference on the basis vectors and to minimize a projection of the intraclass difference on the basis vectors.
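The neighbor-based differences and basis vectors described in this abstract can be illustrated with a minimal sketch. It assumes a specific formulation that the abstract does not state: each difference is the vector from an identity vector to the mean of its k nearest neighbors, the differences are accumulated into scatter matrices, and the basis is taken from a generalized eigenproblem. The function name, the parameters `k`, `n_basis`, and `reg`, and the use of NumPy are all hypothetical choices for illustration.

```python
import numpy as np

def neighbor_discriminant_basis(X, labels, k=1, n_basis=1, reg=1e-6):
    """X: (n, d) identity vectors; labels: (n,) speaker class per vector.

    Returns a (d, n_basis) matrix whose columns are basis vectors.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n, d = X.shape
    # Pairwise squared Euclidean distances between identity vectors.
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    S_b = np.zeros((d, d))  # accumulates interclass differences
    S_w = np.zeros((d, d))  # accumulates intraclass differences
    idx = np.arange(n)
    for i in range(n):
        same = np.flatnonzero((labels == labels[i]) & (idx != i))
        other = np.flatnonzero(labels != labels[i])
        near_same = same[np.argsort(dist[i, same])[:k]]
        near_other = other[np.argsort(dist[i, other])[:k]]
        if near_other.size:
            diff_b = X[i] - X[near_other].mean(axis=0)
            S_b += np.outer(diff_b, diff_b)
        if near_same.size:
            diff_w = X[i] - X[near_same].mean(axis=0)
            S_w += np.outer(diff_w, diff_w)
    # Assumption: a small ridge term keeps S_w invertible.
    S_w += reg * np.eye(d)
    # Solve S_b v = lambda S_w v by whitening with the Cholesky factor of S_w.
    L = np.linalg.cholesky(S_w)
    L_inv = np.linalg.inv(L)
    M = L_inv @ S_b @ L_inv.T
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1][:n_basis]
    return L_inv.T @ vecs[:, order]
```

Projecting identity vectors with `X @ W`, where `W` is the returned matrix, keeps the directions in which neighboring vectors from different speakers differ most relative to neighboring vectors from the same speaker.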
Abstract:
The present disclosure relates to the field of information processing. A method and system for sharing an image-editing action are provided. In the method, a sharing platform receives and stores image-editing action information from each transmitting party; a requesting party transmits an image-editing action information sharing request to the sharing platform; the sharing platform searches for the image-editing action information corresponding to the sharing request and transmits the retrieved image-editing action information to the requesting party; and the requesting party edits a selected image using the image-editing action information returned by the sharing platform. According to the disclosure, the image-editing action information can be directly invoked or shared, thus effectively reducing repetitive image-editing operations for users and improving image-editing efficiency.
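The store, search, and apply flow in this abstract can be sketched as follows. Everything here is a hypothetical illustration: the `EditAction` and `SharingPlatform` names, the in-memory dictionary used as storage, the two example operations (`brighten`, `invert`), and the representation of an image as a flat list of 8-bit pixel values are assumptions, not details from the disclosure.

```python
from dataclasses import dataclass, field

@dataclass
class EditAction:
    action_id: str
    steps: list  # ordered (operation name, parameter) pairs

@dataclass
class SharingPlatform:
    _store: dict = field(default_factory=dict)

    def receive(self, action):
        """Store action information sent by a transmitting party."""
        self._store[action.action_id] = action

    def search(self, action_id):
        """Return the action matching a sharing request, or None."""
        return self._store.get(action_id)

# Hypothetical per-pixel operations, clamped to the 0..255 range.
OPERATIONS = {
    "brighten": lambda pixel, amount: min(255, pixel + amount),
    "invert": lambda pixel, _unused: 255 - pixel,
}

def apply_action(image, action):
    """Replay the stored steps on an image given as a list of pixel values."""
    result = list(image)
    for name, param in action.steps:
        op = OPERATIONS[name]
        result = [op(pixel, param) for pixel in result]
    return result
```

A requesting party would call `search` with the identifier from its sharing request and pass the returned action to `apply_action`, so the same sequence of edits does not have to be repeated by hand.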
Abstract:
An image gaze correction method, apparatus, electronic device, computer-readable storage medium, and computer program product related to the field of artificial intelligence technologies are provided. The image gaze correction method includes: acquiring an eye image from an image; performing feature extraction processing on the eye image to obtain feature information of the eye image; performing, based on the feature information and a target gaze direction, gaze correction processing on the eye image to obtain an initially corrected eye image and an eye contour mask; performing, by using the eye contour mask, adjustment processing on the initially corrected eye image to obtain a corrected eye image; and generating a gaze corrected image based on the corrected eye image.
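One plausible reading of the mask-based adjustment step is a per-pixel blend, sketched below. This is an assumption: the abstract does not say how the eye contour mask is applied. Here the mask is treated as a soft weight in [0, 1] that keeps the initially corrected pixels inside the eye contour and the original pixels outside it. The function and parameter names are hypothetical.

```python
import numpy as np

def adjust_with_mask(original_eye, initially_corrected_eye, eye_contour_mask):
    """Blend two (H, W, C) eye images using an (H, W) mask with values in [0, 1]."""
    original = np.asarray(original_eye, dtype=float)
    corrected = np.asarray(initially_corrected_eye, dtype=float)
    mask = np.clip(np.asarray(eye_contour_mask, dtype=float), 0.0, 1.0)
    if mask.ndim == 2 and original.ndim == 3:
        mask = mask[..., None]  # broadcast the mask over color channels
    # Inside the contour (mask near 1) take the corrected pixel,
    # outside it (mask near 0) keep the original pixel.
    return mask * corrected + (1.0 - mask) * original
```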
Abstract:
An action recognition method includes: obtaining original feature submaps of each of temporal frames on a plurality of convolutional channels by using a multi-channel convolutional layer; calculating, by using each of the temporal frames as a target temporal frame, motion information weights of the target temporal frame on the convolutional channels according to original feature submaps of the target temporal frame and original feature submaps of a next temporal frame, and obtaining motion information feature maps of the target temporal frame on the convolutional channels according to the motion information weights; performing temporal convolution on the motion information feature maps of the target temporal frame to obtain temporal motion feature maps of the target temporal frame; and recognizing an action type of a moving object in image data of the target temporal frame according to the temporal motion feature maps of the target temporal frame on the convolutional channels.
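The weighting and temporal convolution steps can be sketched in NumPy. The abstract does not give the formulas, so the specifics below are assumptions: the weight of each channel is a sigmoid of the spatially averaged difference between the next frame and the target frame, the last frame is paired with itself, and the temporal convolution is a fixed three-tap filter with edge padding. All function names are hypothetical.

```python
import numpy as np

def motion_weights(features):
    """features: (T, C, H, W) original feature submaps. Returns (T, C) weights."""
    # Pair each target frame with the next one; the last frame pairs with itself.
    next_frames = np.concatenate([features[1:], features[-1:]], axis=0)
    diff = next_frames - features
    pooled = diff.mean(axis=(2, 3))        # one scalar per frame and channel
    return 1.0 / (1.0 + np.exp(-pooled))   # sigmoid squashes into (0, 1)

def motion_feature_maps(features):
    """Scale every channel of every frame by its motion information weight."""
    weights = motion_weights(features)
    return features * weights[:, :, None, None]

def temporal_conv(maps, kernel=(0.25, 0.5, 0.25)):
    """Three-tap convolution along the time axis with edge padding."""
    assert len(kernel) == 3, "padding below assumes a three-tap kernel"
    T = maps.shape[0]
    padded = np.concatenate([maps[:1], maps, maps[-1:]], axis=0)
    return sum(k * padded[i:i + T] for i, k in enumerate(kernel))
```

In a trained network the weights and the temporal kernel would be learned rather than fixed; the sketch only shows how data flows from feature submaps to temporal motion feature maps.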
Abstract:
The present disclosure describes a video image processing method and apparatus, a computer-readable medium and an electronic device, relating to the field of image processing technologies. The method includes determining, by a device, a target-object region in a current frame in a video. The device includes a memory storing instructions and a processor in communication with the memory. The method also includes determining, by the device, a target-object tracking image in a next frame and corresponding to the target-object region; and sequentially performing, by the device, a plurality of sets of convolution processing on the target-object tracking image to determine a target-object region in the next frame. A quantity of convolutions of a first set of convolution processing in the plurality of sets of convolution processing is less than a quantity of convolutions of any other set of convolution processing.
Abstract:
The present disclosure discloses an image recognition method and apparatus, and belongs to the field of computer technologies. The method includes: extracting a local binary pattern (LBP) feature vector of a target image; calculating a high-dimensional feature vector of the target image according to the LBP feature vector; obtaining a training matrix, the training matrix being a matrix obtained by training images in an image library by using a joint Bayesian algorithm; and recognizing the target image according to the high-dimensional feature vector of the target image and the training matrix. The image recognition method and apparatus according to the present disclosure may combine an LBP algorithm with a joint Bayesian algorithm to perform recognition, thereby improving the accuracy of image recognition.
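For readers unfamiliar with LBP, the feature extraction step can be illustrated with the basic 8-neighbor variant below. This is a minimal sketch under assumptions: the abstract does not specify which LBP variant, neighborhood, or histogram layout is used, and the function names are hypothetical. The joint Bayesian training step is not shown.

```python
import numpy as np

def lbp_codes(image):
    """Basic 3x3 LBP code for every interior pixel of a 2D grayscale image."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbors visited clockwise from the top-left; each contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set the bit where the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def lbp_feature_vector(image):
    """Normalized 256-bin histogram of LBP codes."""
    hist = np.bincount(lbp_codes(image).ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

In practice an image is usually divided into cells and the per-cell histograms are concatenated, which is one way a longer feature vector could be built from LBP codes.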
Abstract:
An identity verification method performed at a terminal includes: displaying and/or playing in an audio form action guide information selected from a preset action guide information library, and collecting a corresponding set of action images within a preset time window; performing matching detection on the collected set of action images and the action guide information, to obtain a living body detection result indicating whether a living body exists in the collected set of action images; when the living body detection result indicates that a living body exists in the collected set of action images, collecting user identity information and performing verification according to the collected user identity information, to obtain a user identity information verification result; and determining an identity verification result according to the user identity information verification result.
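The gating logic in this abstract, where identity information is only collected after living body detection succeeds, can be sketched as plain control flow. The library contents, the function names, and the idea that an upstream recognizer has already turned the collected action images into a list of action labels are all assumptions made for illustration.

```python
import random

# Hypothetical preset action guide information library.
ACTION_GUIDE_LIBRARY = ["blink", "open mouth", "turn head left"]

def choose_action_guide(rng=random):
    """Pick the action the user will be asked to perform."""
    return rng.choice(ACTION_GUIDE_LIBRARY)

def detect_living_body(guide, recognized_actions):
    """Matching detection: did the collected images show the guided action?"""
    return guide in recognized_actions

def verify_identity(guide, recognized_actions, check_identity):
    """Run identity verification only if a living body was detected.

    check_identity is a caller-supplied callable that collects and verifies
    user identity information and returns True or False.
    """
    if not detect_living_body(guide, recognized_actions):
        return False
    return bool(check_identity())
```

Choosing the guided action at random from the library is what makes a pre-recorded image sequence unlikely to match, which is the purpose of the matching detection step.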
Abstract:
A 3D human face reconstruction method and apparatus, and a server are provided. In some embodiments, the method includes determining feature points on an acquired 2D human face image; determining posture parameters of a human face according to the feature points, and adjusting a posture of a universal 3D human face model according to the posture parameters; determining points on the universal 3D human face model corresponding to the feature points, and adjusting the corresponding points in a sheltered status to obtain a preliminary 3D human face model; and performing deformation adjustment on the preliminary 3D human face model, and performing texture mapping on the deformed 3D human face model to obtain a final 3D human face.