-
Publication No.: US20210201887A1
Publication Date: 2021-07-01
Application No.: US17205121
Filing Date: 2021-03-18
Inventor: Zhijie CHEN, Tao SUN, Lei JIA
IPC: G10L13/047, G10L25/18, G10L25/30
Abstract: The present application discloses a method and an apparatus for training a speech spectrum generation model, as well as an electronic device, and relates to the technical fields of speech synthesis and deep learning. A specific implementation is as follows: inputting a first text sequence into the speech spectrum generation model to generate an analog spectrum sequence corresponding to the first text sequence, and obtaining a first loss value of the analog spectrum sequence according to a preset loss function; inputting the analog spectrum sequence corresponding to the first text sequence into an adversarial loss function model, which is a generative adversarial network model, to obtain a second loss value of the analog spectrum sequence; and training the speech spectrum generation model based on the first loss value and the second loss value.
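The two-loss training signal described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the preset loss function, the form of the adversarial loss, or how the two values are combined, so everything below is an assumption made for illustration: mean squared error against a reference spectrum stands in for the preset loss, `-log(D(x))` stands in for the adversarial loss, a fixed sigmoid (`toy_discriminator`) stands in for the adversarial loss function model, and the weighted sum with `adv_weight` is a hypothetical way of combining them.

```python
import math


def mse_loss(generated, reference):
    """First loss (assumed): mean squared error between two spectrum
    sequences, each given as a list of frames (lists of floats)."""
    total, count = 0.0, 0
    for gen_frame, ref_frame in zip(generated, reference):
        for g, r in zip(gen_frame, ref_frame):
            total += (g - r) ** 2
            count += 1
    return total / count


def toy_discriminator(spectrum):
    """Hypothetical stand-in for the adversarial loss function model:
    a sigmoid of the mean spectrum value, giving a score in (0, 1)."""
    values = [v for frame in spectrum for v in frame]
    mean = sum(values) / len(values)
    return 1.0 / (1.0 + math.exp(-mean))


def adversarial_loss(discriminator_score):
    """Second loss (assumed): generator-side loss -log(D(x))."""
    return -math.log(discriminator_score)


def combined_loss(generated, reference, adv_weight=0.1):
    """Combine both loss values into one training objective.
    The weighted sum and adv_weight are assumptions, not from the patent."""
    first = mse_loss(generated, reference)
    second = adversarial_loss(toy_discriminator(generated))
    return first + adv_weight * second, first, second


if __name__ == "__main__":
    generated = [[0.2, 0.4], [0.1, 0.3]]
    reference = [[0.0, 0.5], [0.1, 0.2]]
    total, first, second = combined_loss(generated, reference)
    print(f"first={first:.4f} second={second:.4f} total={total:.4f}")
```

In an actual training loop, the combined value would be backpropagated through the spectrum generation model, while the discriminator would be trained separately; both steps are outside the scope of this sketch.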
-
Publication No.: US20210375264A1
Publication Date: 2021-12-02
Application No.: US17123253
Filing Date: 2020-12-16
Inventor: Liao ZHANG, Xiaoyin FU, Zhengxiang JIANG, Mingxin LIANG, Junyao SHAO, Qi ZHANG, Zhijie CHEN, Qiguang ZANG
Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.
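The data flow in this abstract (syllable of the pending word, plus syllables of already converted words, encoded together and then decoded to text) can be sketched with a toy example. This is not the patented method: the abstract implies trained encoder and decoder models, whereas the sketch below swaps in plain lookup tables to show only how context syllables can disambiguate a homophone. The syllable strings, `DECODER_TABLE`, `DEFAULT_WORD`, and both function names are hypothetical.

```python
# Hypothetical toy tables; a real system would use trained models instead.
# Keys pair a tuple of context syllables with the pending syllable.
DECODER_TABLE = {
    (("ai", "hav"), "tu"): "two",
    (("mi",), "tu"): "too",
}
# Context-free fallback for a syllable with no matching context.
DEFAULT_WORD = {"tu": "to"}


def encode(first_syllable, second_syllables):
    """Combine the syllable of the to-be-converted word (first syllable)
    with the syllables of already converted words (second syllables)."""
    return (tuple(second_syllables), first_syllable)


def decode(encoding):
    """Map the encoding to text, preferring the longest matching
    context suffix and falling back to a context-free default."""
    context, syllable = encoding
    for start in range(len(context) + 1):
        key = (context[start:], syllable)
        if key in DECODER_TABLE:
            return DECODER_TABLE[key]
    return DEFAULT_WORD.get(syllable, syllable)


if __name__ == "__main__":
    # The same syllable resolves to different words depending on context.
    print(decode(encode("tu", ["ai", "hav"])))
    print(decode(encode("tu", ["mi"])))
    print(decode(encode("tu", [])))
```

The point of the sketch is the interface: the decoder never sees the pending syllable in isolation, only an encoding that also carries the sentence context.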
-