-
Publication number: US20220108688A1
Publication date: 2022-04-07
Application number: US17162624
Application date: 2021-01-29
Applicant: salesforce.com, inc.
Inventor: Guangsen Wang, Chu Hong Hoi, Genta Indra Winata
IPC: G10L15/16, G10L15/065, G10L15/06, G06N3/04, G06N3/08
Abstract: Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for a multilingual speech recognition model that combines adaptation and adjustment methods in integrated end-to-end training to improve the model's generalization and mitigate the long-tail issue. Specifically, the multilingual language model mBERT is utilized and converted into an autoregressive transformer decoder. In addition, a cross-attention module is added to the encoder on top of mBERT's self-attention layer in order to explore the acoustic space in addition to the text space. Joint training of the encoder and the mBERT decoder can bridge the semantic gap between speech and text.
-
Publication number: US11798534B2
Publication date: 2023-10-24
Application number: US17162624
Application date: 2021-01-29
Applicant: salesforce.com, inc.
Inventor: Guangsen Wang, Chu Hong Hoi, Genta Indra Winata
IPC: G10L15/16, G10L15/065, G06N3/08, G06N3/04, G10L15/06
CPC classification number: G10L15/16, G06N3/04, G06N3/08, G10L15/063, G10L15/065
Abstract: Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for a multilingual speech recognition model that combines adaptation and adjustment methods in integrated end-to-end training to improve the model's generalization and mitigate the long-tail issue. Specifically, the multilingual language model mBERT is utilized and converted into an autoregressive transformer decoder. In addition, a cross-attention module is added to the encoder on top of mBERT's self-attention layer in order to explore the acoustic space in addition to the text space. Joint training of the encoder and the mBERT decoder can bridge the semantic gap between speech and text.
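Illustrative note: the decoder structure named in the abstract (a self-attention layer made autoregressive, followed by a cross-attention module over the acoustic encoder output) can be sketched roughly as below. This is a minimal NumPy sketch, not the patented implementation. It assumes that the autoregressive conversion is done with a causal (lower-triangular) mask, which the abstract does not specify, and it uses single-head attention with random weights instead of pretrained mBERT weights. All function and parameter names (`decoder_layer`, `init_params`, `wq_self`, and so on) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, mask=None):
    # Scaled dot-product attention; mask is True where attending is allowed.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    return softmax(scores) @ v

def init_params(d_model, seed=0):
    # Random stand-ins for weights (hypothetical names, not mBERT's).
    rng = np.random.default_rng(seed)
    names = ["wq_self", "wk_self", "wv_self", "wq_cross", "wk_cross", "wv_cross"]
    return {n: rng.normal(scale=d_model ** -0.5, size=(d_model, d_model)) for n in names}

def decoder_layer(text_states, acoustic_states, p):
    # Assumption: a causal mask makes the self-attention autoregressive,
    # so token i only sees tokens 0..i.
    t = text_states.shape[0]
    causal = np.tril(np.ones((t, t), dtype=bool))
    h = text_states + attention(
        text_states @ p["wq_self"],
        text_states @ p["wk_self"],
        text_states @ p["wv_self"],
        mask=causal,
    )
    # Cross-attention on top of self-attention: queries come from the text
    # side, keys and values from the acoustic encoder output.
    h = h + attention(
        h @ p["wq_cross"],
        acoustic_states @ p["wk_cross"],
        acoustic_states @ p["wv_cross"],
    )
    return h

params = init_params(d_model=8)
rng = np.random.default_rng(1)
text = rng.normal(size=(5, 8))     # 5 text-token states
audio = rng.normal(size=(12, 8))   # 12 acoustic frames from the encoder
out = decoder_layer(text, audio, params)
```

In this sketch, changing later text tokens leaves the outputs for earlier positions unchanged, which is the property an autoregressive decoder needs, while every position can still attend to all acoustic frames.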
-