-
Publication number: US11748895B2
Publication date: 2023-09-05
Application number: US17184379
Application date: 2021-02-24
Inventor: Tianwei Lin, Xin Li, Fu Li, Dongliang He, Hao Sun, Henan Zhang
CPC classification number: G06T7/246, G06F18/253, G06N3/04, G06V20/41, G06V20/46
Abstract: A method and apparatus for processing a video frame are provided. The method may include: converting, using an optical flow generated based on a previous frame and a next frame of adjacent frames in a video, a feature map of the previous frame to obtain a converted feature map; determining, based on an error of the optical flow, a weight of the converted feature map, and obtaining a fused feature map based on a weighted result of a feature of the converted feature map and a feature of a feature map of the next frame; and updating the feature map of the next frame as the fused feature map.
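The fusion step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the 1-D feature maps, the integer-valued flow, the helper names `warp` and `fuse`, and the error-to-weight mapping `exp(-error)` are all hypothetical choices made for the example, since the abstract does not specify them.

```python
import math

def warp(prev_feat, flow):
    """Warp a 1-D feature map: output position i samples prev_feat at i - flow[i].

    Integer flow and border clamping are simplifying assumptions.
    """
    n = len(prev_feat)
    out = []
    for i in range(n):
        src = min(max(i - flow[i], 0), n - 1)  # clamp to valid indices
        out.append(prev_feat[src])
    return out

def fuse(prev_feat, next_feat, flow, flow_error):
    """Blend the warped previous-frame features into the next frame's features.

    The weight of the warped feature shrinks as the flow error grows.
    exp(-error) is a hypothetical mapping; the abstract does not name one.
    """
    warped = warp(prev_feat, flow)
    fused = []
    for w_feat, n_feat, err in zip(warped, next_feat, flow_error):
        weight = math.exp(-err)
        fused.append(weight * w_feat + (1.0 - weight) * n_feat)
    return fused

prev_feat = [1.0, 2.0, 3.0, 4.0]
next_feat = [0.0, 0.0, 0.0, 0.0]
flow = [1, 1, 1, 1]                  # content moved one position to the right
flow_error = [0.0, 0.0, 0.0, 50.0]   # the last position has an unreliable flow
fused = fuse(prev_feat, next_feat, flow, flow_error)
```

Where the flow error is zero the warped feature is taken as-is, and where the error is large the result falls back to the next frame's own feature.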
-
Publication number: US11735168B2
Publication date: 2023-08-22
Application number: US17209681
Application date: 2021-03-23
Inventor: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia
CPC classification number: G10L15/16, G06N3/08, G10L15/063, G10L15/197, G10L15/22, G10L15/32, G10L25/18, G10L15/20, G10L2015/0631
Abstract: A method and an apparatus for recognizing a voice are provided. The method may include: inputting a target voice into a pre-trained voice recognition model to obtain an initial text output by at least one recognition network in the voice recognition model, the recognition network including a plurality of preset types of processing layers, and at least one type of processing layer of the recognition network being obtained by training based on a voice sample in a preset direction interval; and determining a voice recognition result of the target voice, based on the initial text.
-
Publication number: US20210350508A1
Publication date: 2021-11-11
Application number: US17382582
Application date: 2021-07-22
Inventor: Xin Li, Fu Li, Tianwei Lin, Henan Zhang
Abstract: A meme generation method, an electronic device, and a storage medium are provided. The method includes: determining a plurality of second expression images corresponding to a target face image based on a plurality of first expression images contained in a first meme; and generating a second meme corresponding to the target face image based on the plurality of second expression images corresponding to the target face image; wherein determining the plurality of second expression images includes: determining an affine transformation parameter between the target face image and an i-th first expression image in the plurality of first expression images according to a corresponding relation between a face key point in the target face image and a face key point in the i-th first expression image; and transforming the target face image based on the affine transformation parameter to obtain an i-th second expression image corresponding to the target face image.
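As an illustration of deriving an affine transformation parameter from key point correspondences, the sketch below solves for a 2-D affine map from exactly three point pairs using Cramer's rule. The function names `solve_affine` and `apply_affine` and the use of only three key points are assumptions made for the example; the abstract does not say how many key points are used or how the parameter is estimated.

```python
def solve_affine(src, dst):
    """Return (a, b, c, d, e, f) such that x' = a*x + b*y + c and y' = d*x + e*y + f.

    src and dst each hold three (x, y) key points; src must be non-collinear.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source key points are collinear")

    def solve_row(v1, v2, v3):
        # Cramer's rule for [x y 1] . [p q r] = v at each of the three points.
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        r = (x1 * (y2 * v3 - y3 * v2)
             - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    a, b, c = solve_row(dst[0][0], dst[1][0], dst[2][0])
    d, e, f = solve_row(dst[0][1], dst[1][1], dst[2][1])
    return a, b, c, d, e, f

def apply_affine(params, point):
    """Map one (x, y) point through the affine parameters."""
    a, b, c, d, e, f = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f

# Hypothetical key points: target face (src) and the i-th expression image (dst).
src_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst_points = [(2.0, 3.0), (4.0, 3.0), (2.0, 6.0)]
params = solve_affine(src_points, dst_points)
moved = apply_affine(params, (1.0, 1.0))
```

With more than three key points, a least-squares fit over all pairs would be the usual alternative; the three-point solve is used here only to keep the example self-contained.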
-
Publication number: US11087741B2
Publication date: 2021-08-10
Application number: US16254309
Application date: 2019-01-22
Inventor: Jianwei Sun, Chao Li, Xin Li, Weixin Zhu, Ming Wen
IPC: G10L15/06, G10L21/0264, G10L25/78, G10L21/0216, G10L15/20, G10L15/16, G10L25/51
Abstract: Embodiments of the present disclosure include methods, apparatuses, devices, and computer readable storage media for processing far-field environmental noise. The method can comprise processing collected far-field environmental noise into a noise segment in a predetermined format. The method can further comprise establishing a far-field voice recognition model based on the noise segment and a near-field voice segment; and determining validity of the noise segment based on the far-field voice recognition model. The solution of the present disclosure can optimize anti-noise performance of the far-field voice recognition model by differentiated training of noise in different user scenarios of a far-field voice recognition product.
-
Publication number: US20210210113A1
Publication date: 2021-07-08
Application number: US17208387
Application date: 2021-03-22
Inventor: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia
Abstract: The present disclosure provides a method and apparatus for detecting a voice, and relates to the fields of voice processing and deep learning technology. The method may include: acquiring a target voice; and inputting the target voice into a pre-trained deep neural network to determine whether the target voice has a sub-voice in each of a plurality of preset direction intervals, the deep neural network being used to predict whether the voice has a sub-voice in each of the plurality of direction intervals.
-
Publication number: US10861480B2
Publication date: 2020-12-08
Application number: US16228656
Application date: 2018-12-20
Inventor: Jianwei Sun, Chao Li, Xin Li, Weixin Zhu, Ming Wen
IPC: H04R3/00, H04R29/00, G10L21/0272, G10L21/0208, G10L21/0316, G10L25/78, G10L25/48, G10L15/06, G10L21/00, G10L25/21
Abstract: Embodiments of the present disclosure provide a method and a device for generating far-field speech data, a computer device and a computer readable storage medium. The method includes obtaining environmental noise in a real environment and adjusting near-field speech data in a near-field speech data set based on the environmental noise, and further includes generating far-field speech data based on the adjusted near-field speech data and the environmental noise.
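One simple way to picture "adjusting near-field speech based on the environmental noise" is to scale the speech to a chosen signal-to-noise ratio and then add the noise. The sketch below does exactly that. The SNR-based gain, the function names `rms` and `mix_at_snr`, and the `snr_db` parameter are assumptions for illustration only; the abstract does not state how the adjustment is performed.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_snr(speech, noise, snr_db):
    """Scale speech so its RMS sits snr_db above the noise RMS, then add the noise.

    Returns (mixed_samples, gain). Equal-length inputs are assumed.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    speech_rms = rms(speech)
    noise_rms = rms(noise)
    if speech_rms == 0 or noise_rms == 0:
        raise ValueError("speech and noise must both be non-silent")
    target_speech_rms = noise_rms * 10 ** (snr_db / 20)
    gain = target_speech_rms / speech_rms
    mixed = [gain * s + n for s, n in zip(speech, noise)]
    return mixed, gain

# Hypothetical toy signals standing in for recorded audio.
near_field = [1.0, -1.0, 1.0, -1.0]
env_noise = [0.5, 0.5, -0.5, -0.5]
far_field, gain = mix_at_snr(near_field, env_noise, snr_db=20)
```

A fuller simulation would typically also model room reverberation, which this sketch leaves out.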
-
Publication number: US11741684B2
Publication date: 2023-08-29
Application number: US17332520
Application date: 2021-05-27
Inventor: Hao Sun, Fu Li, Xin Li, Tianwei Lin
IPC: G06V10/20, G06V10/25, G06T7/11, G06T7/90, G06T3/60, G06V40/16, G06V10/75, G06V10/772, G06V20/20
CPC classification number: G06V10/25, G06T3/60, G06T7/11, G06T7/90, G06V10/758, G06V10/772, G06V20/20, G06V40/162, G06T2207/30201
Abstract: The disclosure provides an image processing method, an image processing apparatus, an electronic device and a storage medium, which belong to the field of computer technologies, and specifically relate to computer vision, image processing, face recognition, and deep learning technologies in artificial intelligence. The method includes: performing skin color recognition on a face image to be processed to determine a target skin color of a face contained in the face image; obtaining a reference transformation image corresponding to the face image by processing the face image using any style transfer model, in response to determining that a style transfer model set does not comprise a style transfer model corresponding to the target skin color; and obtaining a target transformation image matching the target skin color by adjusting a hue value, a saturation value, and a lightness value of each pixel in the target region based on the target skin color.
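The final step, adjusting hue, saturation, and lightness per pixel toward a target skin color, can be sketched with Python's standard `colorsys` module. The linear interpolation rule, the `strength` parameter, the boolean mask, and the function names `shift_toward_target` and `adjust_region` are assumptions for illustration; the abstract does not specify the adjustment formula or how the target region is represented.

```python
import colorsys

def shift_toward_target(pixel_rgb, target_rgb, strength=1.0):
    """Move one pixel's hue, lightness and saturation toward a target color.

    RGB components are floats in [0, 1]; strength 0 keeps the pixel,
    strength 1 lands on the target color.
    """
    h, l, s = colorsys.rgb_to_hls(*pixel_rgb)
    th, tl, ts = colorsys.rgb_to_hls(*target_rgb)
    # Hue is circular, so step along the shorter arc between h and th.
    dh = ((th - h + 0.5) % 1.0) - 0.5
    new_h = (h + strength * dh) % 1.0
    new_l = l + strength * (tl - l)
    new_s = s + strength * (ts - s)
    return colorsys.hls_to_rgb(new_h, new_l, new_s)

def adjust_region(pixels, mask, target_rgb, strength=1.0):
    """Apply the shift only to pixels whose mask entry is True."""
    return [
        shift_toward_target(p, target_rgb, strength) if inside else p
        for p, inside in zip(pixels, mask)
    ]

# Hypothetical values: two pixels, only the first lies in the target region.
pixels = [(0.8, 0.6, 0.5), (0.1, 0.2, 0.3)]
mask = [True, False]
target_skin = (0.4, 0.3, 0.2)
adjusted = adjust_region(pixels, mask, target_skin, strength=1.0)
```

Intermediate `strength` values blend between the stylized pixel and the target skin color, which may be preferable when shading should be preserved.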
-
Publication number: US11463631B2
Publication date: 2022-10-04
Application number: US17025255
Application date: 2020-09-18
Inventor: Henan Zhang, Xin Li, Fu Li, Tianwei Lin, Hao Sun, Shilei Wen, Hongwu Zhang, Errui Ding
Abstract: Embodiments of the present disclosure provide a method and apparatus for generating an image. The method may include: receiving a first image including a face input by a user in an interactive scene; presenting the first image to the user; inputting the first image into a pre-trained generative adversarial network in a backend to obtain a second image output by the generative adversarial network; where the generative adversarial network uses face attribute information generated based on the input image as a constraint; and presenting the second image to the user in response to obtaining the second image output by the generative adversarial network in the backend.
-
Publication number: US20210312686A1
Publication date: 2021-10-07
Application number: US17350449
Application date: 2021-06-17
Inventor: Tianwei Lin, Fu Li, Xiaoqing Ye, Henan Zhang, Xin Li
Abstract: The present disclosure discloses a method and apparatus for generating a human body three-dimensional model, a device and a storage medium. The method may include: receiving a single human body image, and extracting an SMPL human body three-dimensional model corresponding to the human body image and a PIFu human body three-dimensional model corresponding to the human body image; matching the SMPL human body three-dimensional model with the PIFu human body three-dimensional model to obtain a matching result; determining a vertex of the SMPL human body three-dimensional model closest to a vertex of the PIFu human body three-dimensional model based on the matching result to obtain a binding weight of the vertex of the PIFu human body three-dimensional model and each skeleton point of the SMPL human body three-dimensional model; and outputting a drivable human body three-dimensional model.
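The nearest-vertex binding step can be illustrated as below: each PIFu vertex looks up its closest SMPL vertex and inherits that vertex's per-skeleton-point weights. This is a sketch under assumptions, not the disclosed procedure: it presumes the two meshes are already aligned in the same coordinate frame, uses a brute-force search, copies the weights unchanged, and the function names are hypothetical.

```python
def nearest_vertex_index(point, vertices):
    """Index of the vertex with the smallest squared Euclidean distance to point."""
    return min(
        range(len(vertices)),
        key=lambda i: sum((p - v) ** 2 for p, v in zip(point, vertices[i])),
    )

def transfer_binding_weights(pifu_vertices, smpl_vertices, smpl_weights):
    """Give each PIFu vertex the binding weights of its nearest SMPL vertex.

    smpl_weights[i] holds one weight per skeleton point for SMPL vertex i.
    """
    return [
        list(smpl_weights[nearest_vertex_index(p, smpl_vertices)])
        for p in pifu_vertices
    ]

# Hypothetical toy meshes: two SMPL vertices bound to two skeleton points.
smpl_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
smpl_weights = [[1.0, 0.0], [0.2, 0.8]]
pifu_vertices = [(0.1, 0.0, 0.0), (0.9, 0.1, 0.0), (0.4, 0.0, 0.0)]
pifu_weights = transfer_binding_weights(pifu_vertices, smpl_vertices, smpl_weights)
```

For meshes with many thousands of vertices, a spatial index such as a k-d tree would replace the brute-force search.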
-
Publication number: US11600069B2
Publication date: 2023-03-07
Application number: US17144205
Application date: 2021-01-08
Inventor: Tianwei Lin, Xin Li, Dongliang He, Fu Li, Hao Sun, Shilei Wen, Errui Ding
Abstract: A method and apparatus for detecting a temporal action of a video, an electronic device and a storage medium are disclosed, which relate to the field of video processing technologies. An implementation includes: acquiring an initial temporal feature sequence of a video to be detected; acquiring, by a pre-trained video-temporal-action detecting module, implicit features and explicit features of a plurality of configured temporal anchor boxes based on the initial temporal feature sequence; and acquiring, by the video-temporal-action detecting module, the starting position and the ending position of a video clip containing a specified action, the category of the specified action and the probability that the specified action belongs to the category from the plurality of temporal anchor boxes according to the explicit features and the implicit features of the plurality of temporal anchor boxes.