Patent search ap:("BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.") AND inv:"Lei JIA"

1.

Invention application
CONTROL METHOD AND CONTROL APPARATUS FOR SPEECH INTERACTION [In force]

Publication No.: US20210407494A1

Publication Date: 2021-12-30

Application No.: US17118869

Filing Date: 2020-12-11

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Cong GAO, Saisai ZOU, Jinfeng BAI, Lei JIA

IPC: G10L15/08, G10L15/22

Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.

2.

Invention application
METHOD AND APPARATUS FOR RECOGNIZING VOICE [In force]

Publication No.: US20210233518A1

Publication Date: 2021-07-29

Application No.: US17209681

Filing Date: 2021-03-23

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Xin LI, Bin HUANG, Ce ZHANG, Jinfeng BAI, Lei JIA

IPC: G10L15/16, G10L25/18, G10L15/22, G10L15/32, G10L15/06, G10L15/197, G06N3/08

Abstract: A method and an apparatus for recognizing a voice are provided. The method may include: inputting a target voice into a pre-trained voice recognition model to obtain an initial text output by at least one recognition network in the voice recognition model, the recognition network including a plurality of preset types of processing layers, and at least one type of processing layer of the recognition network being obtained by training based on a voice sample in a preset direction interval; and determining a voice recognition result of the target voice, based on the initial text.

3.

Invention application
METHOD FOR HUMAN-COMPUTER INTERACTION, APPARATUS FOR HUMAN-COMPUTER INTERACTION, DEVICE, AND STORAGE MEDIUM [In force]

Publication No.: US20230058437A1

Publication Date: 2023-02-23

Application No.: US17706409

Filing Date: 2022-03-28

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Zhen WU, Jiaxiang GE, Xiao WANG, Xianze SU, Bing LIU, Jiawei WANG, Dan WANG, Song YANG, Jinghao HAO, Yufang WU, Qin QU, Bingqi ZHANG, Xiaoyin FU, Siyuan WU, Chao LI, Cong GAO, Lei JIA

IPC: G10L15/22, G10L15/34, G10L13/027, G06F40/40

Abstract: The present disclosure provides a method for a human-computer interaction, an apparatus for a human-computer interaction, a device, and a storage medium, and the present disclosure relates to the field of artificial intelligence, such as deep learning and voice. A specific implementation includes: acquiring a voice command; performing voice recognition on the voice command to determine a corresponding voice text; sending, in response to satisfying a preset information sending condition, the voice text to a cloud; receiving a resource for the voice command returned from the cloud; and responding to the voice command based on the resource.

4.

Invention application
CONTROL METHOD AND CONTROL APPARATUS FOR SPEECH INTERACTION, STORAGE MEDIUM AND SYSTEM [In force]

Publication No.: US20210407496A1

Publication Date: 2021-12-30

Application No.: US17158726

Filing Date: 2021-01-26

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Cong GAO, Saisai ZOU, Jinfeng BAI, Lei JIA

IPC: G10L15/08, G10L15/22

Abstract: The present disclosure discloses a control method and a control apparatus for speech interaction. The detailed implementation solution of the control method for the speech interaction includes: collecting an audio signal; detecting a wake-up word in the audio signal to obtain a wake-up word result; and playing a prompt tone and/or executing a speech instruction in the audio signal based on the wake-up word result.

5.

Invention application
Method and Apparatus For Training Speech Spectrum Generation Model, and Electronic Device [In force]

Publication No.: US20210201887A1

Publication Date: 2021-07-01

Application No.: US17205121

Filing Date: 2021-03-18

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Zhijie CHEN, Tao SUN, Lei JIA

IPC: G10L13/047, G10L25/18, G10L25/30

Abstract: The present application discloses a method and an apparatus for training a speech spectrum generation model, as well as an electronic device, and relates to the technical field of speech synthesis and deep learning. A specific implementation is as follows: inputting a first text sequence into the speech spectrum generation model to generate an analog spectrum sequence corresponding to the first text sequence, and obtain a first loss value of the analog spectrum sequence according to a preset loss function; inputting the analog spectrum sequence corresponding to the first text sequence into an adversarial loss function model, which is a generative adversarial network model, to obtain a second loss value of the analog spectrum sequence; and training the speech spectrum generation model based on the first loss value and the second loss value.
