-
Publication No.: US11769480B2
Publication date: 2023-09-26
Application No.: US17111238
Application date: 2020-12-03
Inventor: Zhengkun Gao, Junteng Zhang, Wenfu Wang, Tao Sun
IPC: G10L13/00, G10L13/047, G10L13/06, G10L13/10
CPC classification number: G10L13/047, G10L13/06, G10L13/10
Abstract: The present disclosure discloses a method and apparatus for training a model, a method and apparatus for synthesizing a speech, a device and a storage medium, and relates to the field of natural language processing and deep learning technology. The method for training a model may include: determining a phoneme feature and a prosodic word boundary feature of sample text data; inserting a pause character into the phoneme feature according to the prosodic word boundary feature to obtain a combined feature of the sample text data; and training an initial speech synthesis model according to the combined feature of the sample text data, to obtain a target speech synthesis model.
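The pause-insertion step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the pause symbol `<sp>`, the function name `insert_pauses`, and the representation of the prosodic word boundary feature as one boolean per syllable are all assumptions made for illustration.

```python
from typing import List, Sequence

# Hypothetical pause character; the abstract does not name the symbol.
PAUSE = "<sp>"


def insert_pauses(
    syllable_phonemes: Sequence[Sequence[str]],
    boundary_after: Sequence[bool],
    pause: str = PAUSE,
) -> List[str]:
    """Build a combined feature from phonemes and prosodic word boundaries.

    syllable_phonemes: phonemes grouped per syllable (assumed layout).
    boundary_after: True where a prosodic word ends after that syllable.
    """
    if len(syllable_phonemes) != len(boundary_after):
        raise ValueError("one boundary flag is required per syllable")

    combined: List[str] = []
    for phonemes, is_boundary in zip(syllable_phonemes, boundary_after):
        combined.extend(phonemes)
        if is_boundary:
            # Mark the prosodic word boundary with a pause character.
            combined.append(pause)
    return combined


# Example: four syllables forming two prosodic words.
example = insert_pauses(
    [["n", "i"], ["h", "ao"], ["sh", "i"], ["j", "ie"]],
    [False, True, False, True],
)
```

In this sketch the resulting sequence, with pause characters interleaved, is what would be fed to the speech synthesis model during training in place of the plain phoneme sequence.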
-
2.
Publication No.: US11488578B2
Publication date: 2022-11-01
Application No.: US17205121
Application date: 2021-03-18
Inventor: Zhijie Chen, Tao Sun, Lei Jia
IPC: G10L13/00, G10L13/047, G10L13/10, G10L25/18, G10L25/30
Abstract: The present application discloses a method and an apparatus for training a speech spectrum generation model, as well as an electronic device, and relates to the technical field of speech synthesis and deep learning. A specific implementation is as follows: inputting a first text sequence into the speech spectrum generation model to generate an analog spectrum sequence corresponding to the first text sequence, and obtaining a first loss value of the analog spectrum sequence according to a preset loss function; inputting the analog spectrum sequence corresponding to the first text sequence into an adversarial loss function model, which is a generative adversarial network model, to obtain a second loss value of the analog spectrum sequence; and training the speech spectrum generation model based on the first loss value and the second loss value.
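One way to read the two-loss training objective in the abstract is sketched below. The abstract does not specify the preset loss function or how the two loss values are combined, so the mean squared error, the generator-side loss `-log(score)`, the weighted sum, and all function and parameter names here are assumptions for illustration only.

```python
import math
from typing import Sequence


def first_loss(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Preset loss, assumed here to be mean squared error over spectrum values."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def second_loss(discriminator_score: float, eps: float = 1e-12) -> float:
    """Adversarial loss for the generator, assumed to be -log(D(spectrum)).

    discriminator_score: probability in (0, 1] that the adversarial model
    judges the generated spectrum to be real.
    """
    return -math.log(max(discriminator_score, eps))


def combined_loss(first: float, second: float, adv_weight: float = 1.0) -> float:
    """Weighted sum of both loss values (the weighting is an assumption)."""
    return first + adv_weight * second


# Example: one generated spectrum frame scored by the adversarial model.
l1 = first_loss([1.0, 2.0], [1.0, 4.0])
l2 = second_loss(1.0)
total = combined_loss(l1, l2, adv_weight=0.1)
```

In an actual training loop the combined value would be backpropagated through the speech spectrum generation model; the sketch only shows how the two loss values could be computed and merged.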
-