Patent search ap:("Beijing Baidu Netcom Science AND Technology Co. Ltd.") AND inv:"Lei Jia"

1.

Granted invention patent
Method and apparatus for training speech spectrum generation model, and electronic device (in force)

Publication (announcement) number: US11488578B2

Publication (announcement) date: 2022-11-01

Application number: US17205121

Application date: 2021-03-18

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Zhijie Chen, Tao Sun, Lei Jia

IPC: G10L13/00, G10L13/047, G10L13/10, G10L25/18, G10L25/30

Abstract: The present application discloses a method and an apparatus for training a speech spectrum generation model, as well as an electronic device, and relates to the technical field of speech synthesis and deep learning. A specific implementation is as follows: inputting a first text sequence into the speech spectrum generation model to generate an analog spectrum sequence corresponding to the first text sequence, and obtain a first loss value of the analog spectrum sequence according to a preset loss function; inputting the analog spectrum sequence corresponding to the first text sequence into an adversarial loss function model, which is a generative adversarial network model, to obtain a second loss value of the analog spectrum sequence; and training the speech spectrum generation model based on the first loss value and the second loss value.
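The abstract does not say how the first and second loss values are combined. As a minimal sketch, assuming a mean-squared-error preset loss, a standard generator-side adversarial loss, and a weighted sum (the function names and `adv_weight` are hypothetical and not taken from the patent):

```python
import math


def mse_loss(predicted, target):
    """First loss (assumed form): mean squared error over all spectrum bins."""
    count = sum(len(frame) for frame in predicted)
    squared = sum(
        (p - t) ** 2
        for pred_frame, tgt_frame in zip(predicted, target)
        for p, t in zip(pred_frame, tgt_frame)
    )
    return squared / count


def adversarial_loss(discriminator_scores):
    """Second loss (assumed form): -mean(log D(G(text))).

    Each score is the discriminator's probability that a generated
    spectrum is real; the loss shrinks as the scores approach 1.
    """
    eps = 1e-12  # guards against log(0)
    logs = [math.log(max(score, eps)) for score in discriminator_scores]
    return -sum(logs) / len(logs)


def total_loss(predicted, target, discriminator_scores, adv_weight=0.1):
    """Weighted sum of both losses; the weighting is an assumption."""
    return mse_loss(predicted, target) + adv_weight * adversarial_loss(
        discriminator_scores
    )
```

In an actual training loop the combined value would be backpropagated through the spectrum generation model while the discriminator is updated separately, as is usual for generative adversarial training.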

2.

Granted invention patent
Speech processing method and method for generating speech processing model (in force)

Publication (announcement) number: US12118989B2

Publication (announcement) date: 2024-10-15

Application number: US17507437

Application date: 2021-10-21

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xu Chen, Jinfeng Bai, Runqiang Han, Lei Jia

IPC: G10L15/20, G06N3/084, G10L15/06, G10L15/22, G10L21/0208, G10L21/0232, G10L21/038, G10L25/30

CPC classification number: G10L15/20, G06N3/084, G10L15/063, G10L15/22, G10L21/0232, G10L21/038, G10L25/30, G10L2021/02082

Abstract: The present disclosure provides a speech processing method, and a method for generating a speech processing model, related to a field of signal processing technologies. The speech processing method includes: obtaining M speech signals to be processed and N reference signals; performing sub-band decomposition on each of the M speech signals and each of the N reference signals to obtain frequency-band components in each speech signal and each reference signal; processing the frequency-band components in each speech signal and each reference signal by using an echo cancellation model, to obtain an ideal ratio mask corresponding to the N reference signals in each frequency band of each speech signal; and performing echo cancellation on each frequency-band component of each speech signal based on the ideal ratio mask corresponding to the N reference signals in each frequency band of each speech signal, to obtain M echo-cancelled speech signals.
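The final step of the abstract, applying a ratio mask to each frequency-band component, can be sketched as follows. This is only an illustration under stated assumptions: the sub-band decomposition and the echo cancellation model are not shown, one scalar mask per band is assumed, and `apply_ratio_masks` and its data layout are hypothetical names rather than anything defined in the patent.

```python
def apply_ratio_masks(band_components, masks):
    """Scale every sub-band of every speech signal by its ratio mask.

    band_components[m][k] is the list of samples of speech signal m in
    frequency band k; masks[m][k] is the mask the model predicted for
    that band, expected to lie in [0, 1].
    """
    cleaned = []
    for signal_bands, signal_masks in zip(band_components, masks):
        cleaned_signal = []
        for band, mask in zip(signal_bands, signal_masks):
            mask = min(1.0, max(0.0, mask))  # keep the mask in [0, 1]
            cleaned_signal.append([mask * sample for sample in band])
        cleaned.append(cleaned_signal)
    return cleaned
```

A mask near 1 keeps a band largely intact, while a mask near 0 suppresses a band dominated by echo from the reference signals. The M echo-cancelled signals would then be obtained by recombining the masked bands.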

3.

Granted invention patent
Control method and control apparatus for speech interaction, storage medium and system (in force)

Publication (announcement) number: US11823662B2

Publication (announcement) date: 2023-11-21

Application number: US17158726

Application date: 2021-01-26

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Cong Gao, Saisai Zou, Jinfeng Bai, Lei Jia

IPC: G10L15/00, G10L15/08, G10L15/22, G10L15/02, G10L15/14

CPC classification number: G10L15/08, G10L15/22, G10L15/02, G10L15/14, G10L2015/088

Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.

4.

Granted invention patent
Method and apparatus for recognizing voice (in force)

Publication (announcement) number: US11735168B2

Publication (announcement) date: 2023-08-22

Application number: US17209681

Application date: 2021-03-23

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia

IPC: G10L15/16, G06N3/08, G10L15/06, G10L15/197, G10L15/22, G10L15/32, G10L25/18, G10L15/20

CPC classification number: G10L15/16, G06N3/08, G10L15/063, G10L15/197, G10L15/22, G10L15/32, G10L25/18, G10L15/20, G10L2015/0631

Abstract: A method and an apparatus for recognizing a voice are provided. The method may include: inputting a target voice into a pre-trained voice recognition model to obtain an initial text output by at least one recognition network in the voice recognition model, the recognition network including a plurality of preset types of processing layers, and at least one type of processing layer of the recognition network being obtained by training based on a voice sample in a preset direction interval; and determining a voice recognition result of the target voice, based on the initial text.

5.

Granted invention patent
Control method and control apparatus for speech interaction (in force)

Publication (announcement) number: US11615784B2

Publication (announcement) date: 2023-03-28

Application number: US17118869

Application date: 2020-12-11

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Cong Gao, Saisai Zou, Jinfeng Bai, Lei Jia

IPC: G10L15/00, G10L15/08, G10L15/22, G10L15/02, G10L15/14

Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.

6.

Invention patent application
METHOD AND APPARATUS FOR DETECTING VOICE (in force)

Publication (announcement) number: US20210210113A1

Publication (announcement) date: 2021-07-08

Application number: US17208387

Application date: 2021-03-22

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Xin Li, Bin Huang, Ce Zhang, Jinfeng Bai, Lei Jia

IPC: G10L25/30, G10L15/02, G10L25/78

Abstract: The present disclosure provides a method and apparatus for detecting a voice, relates to the fields of voice processing and deep learning technology. The method may include: acquiring a target voice; and inputting the target voice into a pre-trained deep neural network to obtain whether the target voice has a sub-voice in each of a plurality of preset direction intervals, the deep neural network being used to predict whether the voice has a sub-voice in each of the plurality of direction intervals.
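The per-interval prediction described in this abstract resembles a multi-label output, one yes/no decision per direction interval. A minimal sketch of how such an output could be interpreted, assuming the network emits one probability per interval, the intervals split 360 degrees evenly, and a fixed decision threshold is used (none of these details are stated in the abstract, and both function names are hypothetical):

```python
def direction_intervals(num_intervals):
    """Split the full circle into equal-width intervals, in degrees."""
    width = 360.0 / num_intervals
    return [(i * width, (i + 1) * width) for i in range(num_intervals)]


def detect_sub_voices(probabilities, threshold=0.5):
    """Return the direction intervals judged to contain a sub-voice.

    probabilities[i] is the assumed network output for interval i.
    """
    intervals = direction_intervals(len(probabilities))
    return [
        interval
        for interval, probability in zip(intervals, probabilities)
        if probability >= threshold
    ]
```

For example, `detect_sub_voices([0.9, 0.1, 0.6, 0.2])` selects the first and third of four 90-degree intervals, which would indicate two simultaneous sources in different directions.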
