Patent search ap:("BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.") AND inv:"Lei Jia"

11.

Granted invention patent
Method and apparatus of synthesizing speech, method and apparatus of training speech synthesis model, electronic device, and storage medium (Active)

Publication No.: US11769482B2

Publication date: 2023-09-26

Application No.: US17489616

Filing date: 2021-09-29

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Wenfu Wang, Tao Sun, Xilei Wang, Junteng Zhang, Zhengkun Gao, Lei Jia

IPC: G10L13/10, G10L25/30

CPC classification number: G10L13/10, G10L25/30

Abstract: The present disclosure provides a method and apparatus of synthesizing a speech, a method and apparatus of training a speech synthesis model, an electronic device, and a storage medium. The method of synthesizing a speech includes acquiring a style information of a speech to be synthesized, a tone information of the speech to be synthesized, and a content information of a text to be processed; generating an acoustic feature information of the text to be processed, by using a pre-trained speech synthesis model, based on the style information, the tone information, and the content information of the text to be processed; and synthesizing the speech for the text to be processed, based on the acoustic feature information of the text to be processed.

12.

Invention patent application
METHOD AND APPARATUS FOR ALLOCATING MEMORY AND ELECTRONIC DEVICE (Active)

Publication No.: US20220147441A1

Publication date: 2022-05-12

Application No.: US17454900

Filing date: 2021-11-15

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Chao Tian, Lei Jia

IPC: G06F12/02, G10L15/28, G10L15/22, G10L15/16

Abstract: The disclosure provides a method and an apparatus for allocating memory, and an electronic device. Multiple frames of speech data are received and input to a neural network model. The neural network model is configured to ask for multiple data tensors when processing the multiple frames of speech data, and the multiple data tensors share a common memory.
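Illustrative sketch (not taken from the patent): the abstract says only that the data tensors requested while processing the speech frames share a common memory; it does not say how. One possible reading is a single preallocated buffer from which each tensor is handed out as a view, with the buffer reused for every frame. The names `SharedTensorPool`, `request`, `reset`, and `process_frames`, and the doubling step standing in for a model layer, are hypothetical.

```python
from typing import List


class SharedTensorPool:
    """One preallocated buffer; every requested tensor is a view into it."""

    def __init__(self, capacity_floats: int) -> None:
        self.capacity = capacity_floats
        self._buffer = bytearray(capacity_floats * 4)  # 4 bytes per float32
        self._offset = 0  # next free slot, counted in floats

    def reset(self) -> None:
        """Make the whole buffer available again (e.g. for the next frame)."""
        self._offset = 0

    def request(self, n_floats: int) -> memoryview:
        """Hand out a float32 view of the shared buffer; nothing new is allocated."""
        if self._offset + n_floats > self.capacity:
            raise MemoryError("shared pool exhausted")
        start = self._offset * 4
        end = start + n_floats * 4
        self._offset += n_floats
        return memoryview(self._buffer)[start:end].cast("f")


def process_frames(frames: List[List[float]], pool: SharedTensorPool) -> List[float]:
    """Toy stand-in for a model: each frame asks for two tensors from the pool."""
    outputs = []
    for frame in frames:
        pool.reset()  # tensors of the previous frame are no longer needed
        x = pool.request(len(frame))  # input tensor
        y = pool.request(len(frame))  # intermediate tensor
        for i, value in enumerate(frame):
            x[i] = float(value)
            y[i] = x[i] * 2.0  # placeholder for a real layer
        outputs.append(sum(y))
    return outputs


pool = SharedTensorPool(capacity_floats=4)
print(process_frames([[1.0, 2.0], [3.0, 4.0]], pool))
```

Under this reading, the memory footprint stays fixed at the pool size regardless of how many frames are processed.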

13.

Granted invention patent
Speech synthesis method, and electronic device (Active)

Publication No.: US12211485B2

Publication date: 2025-01-28

Application No.: US17820339

Filing date: 2022-08-17

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Zhengkun Gao, Junteng Zhang, Tao Sun, Lei Jia

IPC: G10L13/00, G10L13/047, G10L13/08

Abstract: The disclosure provides a speech synthesis method, and an electronic device. The technical solution is described as follows. A text to be synthesized and speech features of a target user are obtained. Predicted first acoustic features based on the text to be synthesized and the speech features are obtained. A target template audio is obtained from a template audio library based on the text to be synthesized. Second acoustic features of the target template audio are extracted. Target acoustic features are generated by splicing the first acoustic features and the second acoustic features. Speech synthesis is performed on the text to be synthesized based on the target acoustic features and the speech features, to generate a target speech of the text to be synthesized.
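Illustrative sketch (not taken from the patent): the data flow in the abstract can be written as a small pipeline. The abstract does not describe the acoustic model, the template lookup, the feature extractor, or the vocoder, so all four are passed in as hypothetical callables (`select_template`, `predict`, `extract`, `vocode`). Treating "splicing" as concatenation along the time axis, with the predicted features first, is also an assumption.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # one acoustic feature vector per time step


def synthesize(
    text: str,
    speech_features: Sequence[float],
    template_library: dict,
    select_template: Callable[[str, dict], object],
    predict: Callable[[str, Sequence[float]], List[Frame]],
    extract: Callable[[object], List[Frame]],
    vocode: Callable[[List[Frame], Sequence[float]], List[float]],
) -> List[float]:
    # 1. Predict the first acoustic features from the text and the speaker's features.
    first = predict(text, speech_features)
    # 2. Pick a template audio for this text and extract the second acoustic features.
    template_audio = select_template(text, template_library)
    second = extract(template_audio)
    # 3. Splice the two feature sequences (assumed: concatenation in time).
    target = first + second
    # 4. Generate the target speech from the spliced features and speaker features.
    return vocode(target, speech_features)


# Toy stand-ins so the pipeline can be run end to end.
speech = synthesize(
    text="hi",
    speech_features=[0.1],
    template_library={"greeting": [1, 2]},
    select_template=lambda text, lib: lib["greeting"],
    predict=lambda text, spk: [[float(len(text))]],
    extract=lambda audio: [[float(x)] for x in audio],
    vocode=lambda feats, spk: [frame[0] for frame in feats],
)
print(speech)
```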

14.

Invention patent application
METHOD AND APPARATUS FOR PROCESSING SPEECH, ELECTRONIC DEVICE AND STORAGE MEDIUM (Active)

Publication No.: US20230015112A1

Publication date: 2023-01-19

Application No.: US17933152

Filing date: 2022-09-19

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Jiankang Hou, Tao Sun, Zhipeng Nie, Liqiang Zhang, Lei Jia, Haifeng Wang

IPC: G10L21/10, G10L13/02, G10L21/0208, G10L25/51

Abstract: A method for processing a speech includes: acquiring an original speech; extracting a spectrogram from the original speech; acquiring a speech synthesis model, where the speech synthesis model comprises a first generation sub-model and a second generation sub-model; generating a harmonic structure of the spectrogram, by invoking the first generation sub-model to process the spectrogram; and generating a target speech, by invoking the second generation sub-model to process the harmonic structure and the spectrogram.

15.

Invention patent application
METHOD FOR TRAINING DATA PROCESSING MODEL, ELECTRONIC DEVICE AND STORAGE MEDIUM (Active)

Publication No.: US20220207427A1

Publication date: 2022-06-30

Application No.: US17655253

Filing date: 2022-03-17

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Yangkai Xu, Guibin Wang, Xiaoyin Fu, Zhijie Chen, Mingshun Yang, Shijun Cong, Ming Jia, Lei Jia

IPC: G06N20/00

Abstract: A method for training a data processing model includes: acquiring sample data; acquiring an initial data processing model, the initial data processing model including a plurality of forward nodes for outputting a plurality of intermediate results corresponding to the sample data; determining a plurality of time-dependent features corresponding to the plurality of forward nodes; acquiring a data processing model to be trained by processing the initial data processing model based on the plurality of time-dependent features; and training the data processing model to be trained using the sample data and the plurality of intermediate results.

16.

Granted invention patent
Method of recognizing speech offline, electronic device, and storage medium (Active)

Publication No.: US12183323B2

Publication date: 2024-12-31

Application No.: US17644749

Filing date: 2021-12-16

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventor: Xiaoyin Fu, Mingxin Liang, Zhijie Chen, Qiguang Zang, Zhengxiang Jiang, Liao Zhang, Qi Zhang, Lei Jia

IPC: G10L15/02, G10L15/16, G10L19/032

Abstract: The present disclosure provides a method of recognizing speech offline, electronic device, and a storage medium, relating to a field of artificial intelligence such as speech recognition, natural language processing, and deep learning. The method may include: decoding speech data to be recognized into a syllable recognition result; transforming the syllable recognition result into a corresponding text as a speech recognition result of the speech data.

17.

Granted invention patent
Task processing method and apparatus, electronic device and storage medium (Active)

Publication No.: US11640319B1

Publication date: 2023-05-02

Application No.: US17945166

Filing date: 2022-09-15

Applicant: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.

Inventor: Gang Ji, Chao Tian, Lei Jia

IPC: G06F9/48, G06F12/0802, G06F9/50

Abstract: A task processing method, an electronic device and a storage medium, which relate to the field of artificial intelligence, such as intelligent voices, artificial intelligence chips, or the like, are disclosed. The method may include: for to-be-executed tasks, in at least one round of processing, performing the following operations: in response to determining that one or more high-priority tasks exist in the to-be-executed tasks, calling the one or more high-priority tasks to process audio data cached in a memory; and after execution of the one or more high-priority tasks is completed, and in response to determining that one or more low-priority task exist in the to-be-executed tasks, calling the one or more low-priority tasks to process the audio data.
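Illustrative sketch (not taken from the patent): one round of the processing order described in the abstract, in which every high-priority task runs on the cached audio first and the low-priority tasks run only after those have completed. The `Task` type, the `process_round` function, and the example task names are hypothetical; how the real system caches audio or schedules rounds is not stated in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    name: str
    high_priority: bool
    run: Callable[[bytes], str]  # processes the cached audio, returns a result


def process_round(tasks: List[Task], cached_audio: bytes) -> List[str]:
    """Run one round: all high-priority tasks, then all low-priority tasks."""
    results = []
    high = [t for t in tasks if t.high_priority]
    low = [t for t in tasks if not t.high_priority]
    # High-priority tasks, if any exist, process the cached audio first.
    for task in high:
        results.append(task.run(cached_audio))
    # Low-priority tasks, if any exist, run only after the high-priority ones finish.
    for task in low:
        results.append(task.run(cached_audio))
    return results


tasks = [
    Task("noise_stats", False, lambda audio: f"noise_stats:{len(audio)}"),
    Task("wake_word", True, lambda audio: f"wake_word:{len(audio)}"),
]
print(process_round(tasks, cached_audio=bytes(4)))
```

Both groups read the same cached buffer, so the audio is held in memory once and reused across tasks within the round.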
