SYSTEMS AND METHODS FOR A MULTILINGUAL SPEECH RECOGNITION FRAMEWORK

    Publication No.: US20220108688A1

    Publication Date: 2022-04-07

    Application No.: US17162624

    Filing Date: 2021-01-29

    Abstract: Embodiments described herein provide an Adapt-and-Adjust (A2) mechanism for a multilingual speech recognition model that combines adaptation and adjustment methods in an integrated end-to-end training process to improve the model's generalization and mitigate the long-tailed issue. Specifically, the multilingual language model mBERT is utilized and converted into an autoregressive transformer decoder. In addition, a cross-attention module is added to the encoder on top of mBERT's self-attention layer in order to explore the acoustic space in addition to the text space. Joint training of the encoder and the mBERT decoder can bridge the semantic gap between speech and text.
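
    The decoder structure described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the cross-attention module lets text-side states query the acoustic encoder's outputs, and it omits learned projections, multiple heads, layer normalization, and the mBERT weights. The names `attention` and `decoder_layer` and the toy vectors are hypothetical.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values, causal=False):
    # Scaled dot-product attention over lists of equal-length vectors.
    d = len(queries[0])
    outputs, weights = [], []
    for i, q in enumerate(queries):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        if causal:
            # Autoregressive mask: position i may not see positions j > i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(w[j] * values[j][c] for j in range(len(values)))
                        for c in range(len(values[0]))])
    return outputs, weights

def decoder_layer(text_states, acoustic_states):
    # 1) Causal self-attention over the text states (the "text space").
    sa, self_w = attention(text_states, text_states, text_states, causal=True)
    h = [[a + b for a, b in zip(x, y)] for x, y in zip(text_states, sa)]
    # 2) Cross-attention stacked on top: text queries attend over the
    #    acoustic encoder outputs (the "acoustic space"). Assumed wiring.
    ca, cross_w = attention(h, acoustic_states, acoustic_states)
    out = [[a + b for a, b in zip(x, y)] for x, y in zip(h, ca)]
    return out, self_w, cross_w

# Toy inputs: 3 text positions and 4 acoustic frames, dimension 2.
text = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
audio = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.2, 0.8]]
out, self_w, cross_w = decoder_layer(text, audio)
```

    In this sketch the causal mask is what makes the self-attention autoregressive, while the cross-attention weights are unmasked so that every text position can draw on every acoustic frame.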
