Patent search ap:("Beijing Baidu Netcom Science AND Technology Co., Ltd.") AND inv:"Zhijie CHEN"

1.

Invention application
Method and Apparatus For Training Speech Spectrum Generation Model, and Electronic Device (legal status: in force)

Publication No.: US20210201887A1

Publication date: 2021-07-01

Application No.: US17205121

Filing date: 2021-03-18

Applicant: Beijing Baidu Netcom Science and Technology Co., Ltd.

Inventor: Zhijie CHEN, Tao SUN, Lei JIA

IPC: G10L13/047, G10L25/18, G10L25/30

Abstract: The present application discloses a method and an apparatus for training a speech spectrum generation model, as well as an electronic device, and relates to the technical fields of speech synthesis and deep learning. A specific implementation is as follows: inputting a first text sequence into the speech spectrum generation model to generate an analog spectrum sequence corresponding to the first text sequence, and obtaining a first loss value of the analog spectrum sequence according to a preset loss function; inputting the analog spectrum sequence corresponding to the first text sequence into an adversarial loss function model, which is a generative adversarial network model, to obtain a second loss value of the analog spectrum sequence; and training the speech spectrum generation model based on the first loss value and the second loss value.
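
The two-loss training objective described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: mean squared error stands in for the "preset loss function", a toy logistic scorer stands in for the generative adversarial network model, the second loss uses the common non-saturating form -log D(G(text)), and the names `first_loss`, `second_loss`, `combined_loss`, and `adv_weight` are hypothetical. The abstract does not say how the two loss values are combined; a weighted sum is one plausible choice.

```python
import math


def first_loss(generated, reference):
    """Assumed 'preset loss': mean squared error over all spectrum bins."""
    total, count = 0.0, 0
    for gen_frame, ref_frame in zip(generated, reference):
        for g, r in zip(gen_frame, ref_frame):
            total += (g - r) ** 2
            count += 1
    return total / count


def discriminator_score(spectrum, weights, bias=0.0):
    """Toy stand-in for the adversarial model: a logistic score in (0, 1)
    computed from the frame-averaged spectrum. A real system would use a
    trained neural discriminator here."""
    n_frames = len(spectrum)
    mean_bins = [
        sum(frame[b] for frame in spectrum) / n_frames
        for b in range(len(weights))
    ]
    logit = sum(w * m for w, m in zip(weights, mean_bins)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def second_loss(generated, weights):
    """Assumed adversarial loss for the generator: -log D(G(text))."""
    return -math.log(discriminator_score(generated, weights))


def combined_loss(generated, reference, weights, adv_weight=0.1):
    """Hypothetical combination: weighted sum of the two loss values."""
    return first_loss(generated, reference) + adv_weight * second_loss(
        generated, weights
    )


if __name__ == "__main__":
    generated = [[0.0, 0.0], [0.0, 0.0]]  # 2 frames x 2 bins, made-up values
    reference = [[1.0, 1.0], [1.0, 1.0]]
    print(combined_loss(generated, reference, weights=[0.0, 0.0]))
```

In an actual training loop the combined value would be backpropagated through the spectrum generation model, while the adversarial model would be trained separately to distinguish generated spectra from real ones.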

2.

Invention application
METHOD AND APPARATUS FOR SPEECH RECOGNITION, AND STORAGE MEDIUM (legal status: in force)

Publication No.: US20210375264A1

Publication date: 2021-12-02

Application No.: US17123253

Filing date: 2020-12-16

Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventor: Liao ZHANG, Xiaoyin FU, Zhengxiang JIANG, Mingxin LIANG, Junyao SHAO, Qi ZHANG, Zhijie CHEN, Qiguang ZANG

IPC: G10L15/02, G10L15/26, G06F40/12, G06F40/30, G06F17/16, G06F7/78

Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.
