Patent search ap:("Beijing Baidu Netcom Science AND Technology Co. Page Ltd.") AND inv:"Xin Li"

1.

Granted invention patent
Method and apparatus for processing video frame [In force]

Publication (announcement) number: US11748895B2

Publication (announcement) date: 2023-09-05

Application number: US17184379

Application date: 2021-02-24

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Tianwei Lin, Xin Li, Fu Li, Dongliang He, Hao Sun, Henan Zhang

IPC: G06K9/00, G06T7/246, G06N3/04, G06V20/40, G06F18/25

CPC classification number: G06T7/246, G06F18/253, G06N3/04, G06V20/41, G06V20/46

Abstract: A method and apparatus for processing a video frame are provided. The method may include: converting, using an optical flow generated based on a previous frame and a next frame of adjacent frames in a video, a feature map of the previous frame to obtain a converted feature map; determining, based on an error of the optical flow, a weight of the converted feature map, and obtaining a fused feature map based on a weighted result of a feature of the converted feature map and a feature of a feature map of the next frame; and updating the feature map of the next frame as the fused feature map.
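The fusion step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation. The abstract does not say how the weight is derived from the optical-flow error, so the exp(-error) mapping below is an assumption, as are the nearest-neighbour backward warp, the (dx, dy) flow layout, and the function names warp and fuse.

```python
import math

def warp(feature, flow):
    """Nearest-neighbour backward warp of a 2-D feature map.

    flow[y][x] is an assumed (dx, dy) displacement from the previous
    frame to the next frame at pixel (x, y).
    """
    h, w = len(feature), len(feature[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            # Look up the previous-frame location the flow points back to,
            # clamped to the borders of the map.
            sx = min(max(int(round(x - dx)), 0), w - 1)
            sy = min(max(int(round(y - dy)), 0), h - 1)
            out[y][x] = feature[sy][sx]
    return out

def fuse(prev_feature, next_feature, flow, flow_error):
    """Blend the warped previous-frame features into the next-frame features."""
    warped = warp(prev_feature, flow)
    h, w = len(next_feature), len(next_feature[0])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Assumed mapping: a small flow error gives a weight near 1,
            # a large flow error gives a weight near 0.
            weight = math.exp(-flow_error[y][x])
            fused[y][x] = weight * warped[y][x] + (1.0 - weight) * next_feature[y][x]
    return fused
```

In this sketch the caller would then replace the next frame's feature map with the returned fused map, which matches the final updating step the abstract describes.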

2.

Granted invention patent
Method and apparatus for recognizing voice [In force]

Publication (announcement) number: US11735168B2

Publication (announcement) date: 2023-08-22

Application number: US17209681

Application date: 2021-03-23

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia

IPC: G10L15/16, G06N3/08, G10L15/06, G10L15/197, G10L15/22, G10L15/32, G10L25/18, G10L15/20

CPC classification number: G10L15/16, G06N3/08, G10L15/063, G10L15/197, G10L15/22, G10L15/32, G10L25/18, G10L15/20, G10L2015/0631

Abstract: A method and an apparatus for recognizing a voice are provided. The method may include: inputting a target voice into a pre-trained voice recognition model to obtain an initial text output by at least one recognition network in the voice recognition model, the recognition network including a plurality of preset types of processing layers, and at least one type of processing layer of the recognition network being obtained by training based on a voice sample in a preset direction interval; and determining a voice recognition result of the target voice, based on the initial text.

3.

Invention patent application
MEME GENERATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM [In force]

Publication (announcement) number: US20210350508A1

Publication (announcement) date: 2021-11-11

Application number: US17382582

Application date: 2021-07-22

Applicant: Beijing Baidu Netcom Science and Technology Co., LTD

Inventor: Xin Li, Fu Li, Tianwei Lin, Henan Zhang

IPC: G06T5/00, G06K9/00, G06K9/62, G06T3/00

Abstract: A meme generation method, an electronic device, and a storage medium are provided. The method includes: determining a plurality of second expression images corresponding to a target face image based on a plurality of first expression images contained in a first meme; and generating a second meme corresponding to the target face image based on the plurality of second expression images. Determining the second expression images includes: determining an affine transformation parameter between the target face image and an i-th first expression image in the plurality of first expression images according to a correspondence between a face key point in the target face image and a face key point in the i-th first expression image; and transforming the target face image based on the affine transformation parameter to obtain an i-th second expression image corresponding to the target face image.
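As an illustration of the affine step, the sketch below estimates a 2x3 affine transform from exactly three key-point correspondences using Cramer's rule. It is one possible reading, not the method claimed in the patent: the abstract does not state how many key points are used or how the parameters are solved, and the names det3, solve3, estimate_affine and apply_affine are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(m, rhs):
    """Solve the 3x3 linear system m @ x = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("key points are collinear; the affine transform is not unique")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution

def estimate_affine(src_pts, dst_pts):
    """Return ((a, b, c), (d, e, f)) with x' = a*x + b*y + c and y' = d*x + e*y + f.

    src_pts: three (x, y) key points in the target face image.
    dst_pts: the matching key points in the i-th first expression image.
    """
    m = [[x, y, 1.0] for x, y in src_pts]
    row_x = solve3(m, [p[0] for p in dst_pts])
    row_y = solve3(m, [p[1] for p in dst_pts])
    return tuple(row_x), tuple(row_y)

def apply_affine(params, pt):
    """Map one (x, y) point through the estimated transform."""
    (a, b, c), (d, e, f) = params
    x, y = pt
    return (a * x + b * y + c, d * x + e * y + f)
```

With more than three key points, a least-squares fit would be the usual replacement for the exact solve shown here.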

4.

Granted invention patent
Method, apparatus, device and storage medium for processing far-field environmental noise [In force]

Publication (announcement) number: US11087741B2

Publication (announcement) date: 2021-08-10

Application number: US16254309

Application date: 2019-01-22

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Jianwei Sun, Chao Li, Xin Li, Weixin Zhu, Ming Wen

IPC: G10L15/06, G10L21/0264, G10L25/78, G10L21/0216, G10L15/20, G10L15/16, G10L25/51

Abstract: Embodiments of the present disclosure include methods, apparatuses, devices, and computer readable storage media for processing far-field environmental noise. The method can comprise processing collected far-field environmental noise into a noise segment in a predetermined format. The method can further comprise establishing a far-field voice recognition model based on the noise segment and a near-field voice segment; and determining validity of the noise segment based on the far-field voice recognition model. The solution of the present disclosure can optimize anti-noise performance of the far-field voice recognition model by differentiated training of noise in different user scenarios of a far-field voice recognition product.

5.

Invention patent application
METHOD AND APPARATUS FOR DETECTING VOICE [In force]

Publication (announcement) number: US20210210113A1

Publication (announcement) date: 2021-07-08

Application number: US17208387

Application date: 2021-03-22

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia

IPC: G10L25/30, G10L15/02, G10L25/78

Abstract: The present disclosure provides a method and apparatus for detecting a voice, and relates to the fields of voice processing and deep learning technology. The method may include: acquiring a target voice; and inputting the target voice into a pre-trained deep neural network to determine whether the target voice has a sub-voice in each of a plurality of preset direction intervals, the deep neural network being used to predict whether the voice has a sub-voice in each of the plurality of direction intervals.

6.

Granted invention patent
Method and device for generating far-field speech data, computer device and computer readable storage medium [In force]

Publication (announcement) number: US10861480B2

Publication (announcement) date: 2020-12-08

Application number: US16228656

Application date: 2018-12-20

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Jianwei Sun, Chao Li, Xin Li, Weixin Zhu, Ming Wen

IPC: H04R3/00, H04R29/00, G10L21/0272, G10L21/0208, G10L21/0316, G10L25/78, G10L25/48, G10L15/06, G10L21/00, G10L25/21

Abstract: Embodiments of the present disclosure provide a method and a device for generating far-field speech data, a computer device and a computer readable storage medium. The method includes obtaining environmental noise in a real environment, adjusting near-field speech data in a near-field speech data set based on the environmental noise, and generating far-field speech data based on the adjusted near-field speech data and the environmental noise.
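One common way to realise "adjust the near-field speech based on the noise, then combine" is to scale the speech to a chosen signal-to-noise ratio before adding the noise. The abstract does not say that the patent uses an SNR target, so the sketch below rests on that assumption. The names rms and simulate_far_field and the target_snr_db parameter are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def simulate_far_field(near_speech, noise, target_snr_db):
    """Scale near-field speech to an assumed target SNR and add recorded noise."""
    speech_rms = rms(near_speech)
    noise_rms = rms(noise)
    if speech_rms == 0.0:
        raise ValueError("near-field speech is silent")
    # Choose gain so that 20 * log10(gain * speech_rms / noise_rms) == target_snr_db.
    gain = noise_rms * 10 ** (target_snr_db / 20.0) / speech_rms
    adjusted = [gain * s for s in near_speech]
    # Repeat the noise recording if it is shorter than the speech.
    return [a + noise[i % len(noise)] for i, a in enumerate(adjusted)]
```

A fuller simulation would typically also model room reverberation, which this sketch leaves out.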

7.

Granted invention patent
Image processing method, electronic device and storage medium for performing skin color recognition on a face image [In force]

Publication (announcement) number: US11741684B2

Publication (announcement) date: 2023-08-29

Application number: US17332520

Application date: 2021-05-27

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Hao Sun, Fu Li, Xin Li, Tianwei Lin

IPC: G06V10/20, G06V10/25, G06T7/11, G06T7/90, G06T3/60, G06V40/16, G06V10/75, G06V10/772, G06V20/20

CPC classification number: G06V10/25, G06T3/60, G06T7/11, G06T7/90, G06V10/758, G06V10/772, G06V20/20, G06V40/162, G06T2207/30201

Abstract: The disclosure provides an image processing method, an image processing apparatus, an electronic device and a storage medium, which belong to the field of computer technologies, and specifically relate to computer vision, image processing, face recognition, and deep learning technologies in artificial intelligence. The method includes: performing skin color recognition on a face image to be processed to determine a target skin color of a face contained in the face image; obtaining a reference transformation image corresponding to the face image by processing the face image using any style transfer model in response to determining that a style transfer model set does not comprise a style transfer model corresponding to the target skin color; and obtaining a target transformation image matching the target skin color by adjusting a hue value, a saturation value, and a lightness value of each pixel in the target region based on the target skin color.
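The per-pixel hue, saturation and lightness adjustment can be sketched with Python's standard colorsys module. This is an illustration under assumptions, not the patented procedure: the abstract does not give the adjustment rule, so the linear blend toward the target color, the strength parameter, and the function name shift_toward_skin_tone are all hypothetical. Selecting which pixels form the target region is outside the scope of the sketch.

```python
import colorsys

def shift_toward_skin_tone(pixel_rgb, target_rgb, strength=0.5):
    """Move one 8-bit RGB pixel toward a target color in HLS space.

    strength=0.0 leaves the pixel unchanged; strength=1.0 returns the target.
    """
    h, l, s = colorsys.rgb_to_hls(*[c / 255.0 for c in pixel_rgb])
    th, tl, ts = colorsys.rgb_to_hls(*[c / 255.0 for c in target_rgb])
    # Hue is circular, so move along the shorter arc between the two hues.
    dh = ((th - h + 0.5) % 1.0) - 0.5
    new_h = (h + strength * dh) % 1.0
    new_l = l + strength * (tl - l)
    new_s = s + strength * (ts - s)
    r, g, b = colorsys.hls_to_rgb(new_h, new_l, new_s)
    return tuple(int(round(c * 255)) for c in (r, g, b))
```

Applied to an image, the function would be called once for every pixel inside the region to be recolored.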

8.

Granted invention patent
Method and apparatus for generating face image [In force]

Publication (announcement) number: US11463631B2

Publication (announcement) date: 2022-10-04

Application number: US17025255

Application date: 2020-09-18

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Henan Zhang, Xin Li, Fu Li, Tianwei Lin, Hao Sun, Shilei Wen, Hongwu Zhang, Errui Ding

IPC: H04N5/262, H04N5/232, G06T5/00, G06V40/16

Abstract: Embodiments of the present disclosure provide a method and apparatus for generating an image. The method may include: receiving a first image including a face input by a user in an interactive scene; presenting the first image to the user; inputting the first image into a pre-trained generative adversarial network in a backend to obtain a second image output by the generative adversarial network; where the generative adversarial network uses face attribute information generated based on the input image as a constraint; and presenting the second image to the user in response to obtaining the second image output by the generative adversarial network in the backend.

9.

Invention patent application
METHOD AND APPARATUS FOR GENERATING HUMAN BODY THREE-DIMENSIONAL MODEL, DEVICE AND STORAGE MEDIUM [In force]

Publication (announcement) number: US20210312686A1

Publication (announcement) date: 2021-10-07

Application number: US17350449

Application date: 2021-06-17

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Tianwei Lin, Fu Li, Xiaoqing Ye, Henan Zhang, Xin Li

IPC: G06T13/40, G06T7/10, G06T3/00, G06T11/00

Abstract: The present disclosure discloses a method and apparatus for generating a human body three-dimensional model, a device and a storage medium. The method may include: receiving a single human body image, and extracting an SMPL human body three-dimensional model corresponding to the human body image and a PIFu human body three-dimensional model corresponding to the human body image; matching the SMPL human body three-dimensional model with the PIFu human body three-dimensional model to obtain a matching result; determining a vertex of the SMPL human body three-dimensional model closest to a vertex of the PIFu human body three-dimensional model based on the matching result to obtain a binding weight of the vertex of the PIFu human body three-dimensional model and each skeleton point of the SMPL human body three-dimensional model; and outputting a drivable human body three-dimensional model.
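The nearest-vertex step can be sketched as a brute-force lookup that copies skinning weights from the closest SMPL vertex to each PIFu vertex. This is a simplified reading of the abstract, not the patented procedure. It assumes the two meshes are already aligned in the same coordinate frame and that the binding weight is copied unchanged. The function name transfer_skinning_weights and the data layout are hypothetical.

```python
def transfer_skinning_weights(pifu_vertices, smpl_vertices, smpl_weights):
    """Give each PIFu vertex the bone weights of its nearest SMPL vertex.

    pifu_vertices, smpl_vertices: lists of (x, y, z) tuples in a shared frame.
    smpl_weights: one list of per-skeleton-point weights for each SMPL vertex.
    """
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    result = []
    for v in pifu_vertices:
        # Brute-force nearest-neighbour search; a spatial index such as a
        # k-d tree would be the usual choice for full-size meshes.
        nearest = min(range(len(smpl_vertices)),
                      key=lambda i: sq_dist(v, smpl_vertices[i]))
        result.append(list(smpl_weights[nearest]))
    return result
```

The returned weights bind each PIFu vertex to the SMPL skeleton points, so the detailed mesh can be driven by the SMPL skeleton.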

10.

Granted invention patent
Method and apparatus for detecting temporal action of video, electronic device and storage medium [In force]

Publication (announcement) number: US11600069B2

Publication (announcement) date: 2023-03-07

Application number: US17144205

Application date: 2021-01-08

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Tianwei Lin, Xin Li, Dongliang He, Fu Li, Hao Sun, Shilei Wen, Errui Ding

IPC: G06K9/00, G06K9/62, G06V20/40

Abstract: A method and apparatus for detecting a temporal action of a video, an electronic device and a storage medium are disclosed, which relate to the field of video processing technologies. An implementation includes: acquiring an initial temporal feature sequence of a video to be detected; acquiring, by a pre-trained video-temporal-action detecting module, implicit features and explicit features of a plurality of configured temporal anchor boxes based on the initial temporal feature sequence; and acquiring, by the video-temporal-action detecting module, the starting position and the ending position of a video clip containing a specified action, the category of the specified action and the probability that the specified action belongs to the category from the plurality of temporal anchor boxes according to the explicit features and the implicit features of the plurality of temporal anchor boxes.
