- Patent title: Two-pass end-to-end speech recognition
- Application number: US17616135
- Filing date: 2020-12-03
- Publication number: US12073824B2
- Publication date: 2024-08-27
- Inventors: Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Jean Bruguier, Shuo-Yiin Chang, Wei Li
- Applicant: GOOGLE LLC
- Applicant address: US CA Mountain View
- Patentee: GOOGLE LLC
- Current patentee: GOOGLE LLC
- Current patentee address: US CA Mountain View
- Agency: Gray Ice Higdon
- International application: PCT/US2020/063012 2020.12.03
- International publication: WO2021/113443A 2021.06.10
- National phase entry date: 2021-12-02
- Main classification: G10L15/00
- IPC classifications: G10L15/00; G06N3/08; G10L15/05; G10L15/06; G10L15/16; G10L15/22
Abstract:
Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transducer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
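The data flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function and type names (`two_pass_recognize`, `Hypothesis`, the `toy_*` stand-ins) are hypothetical, the neural components are replaced by toy callables, and the second pass is assumed to revise the candidates by rescoring them, which is only one possible way to "revise" them.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Hypothesis:
    text: str
    score: float  # log-probability-like score; higher is better


def two_pass_recognize(
    audio_frames: Sequence[Sequence[float]],
    shared_encoder: Callable[[Sequence[float]], float],
    first_pass_decoder: Callable[[List[float]], List[Hypothesis]],
    second_pass_rescorer: Callable[[List[float], Hypothesis], float],
) -> Tuple[List[str], str]:
    """Return (streaming partial results, final text)."""
    encodings: List[float] = []
    streaming: List[str] = []
    candidates: List[Hypothesis] = []

    # First pass: encode each frame once and emit a streaming candidate.
    for frame in audio_frames:
        encodings.append(shared_encoder(frame))
        candidates = first_pass_decoder(encodings)
        if candidates:
            streaming.append(max(candidates, key=lambda h: h.score).text)

    # Second pass: reuse the same encoder output to revise the candidates
    # (assumption: revision is done by rescoring the first-pass hypotheses).
    rescored = [
        Hypothesis(h.text, second_pass_rescorer(encodings, h)) for h in candidates
    ]
    final = max(rescored, key=lambda h: h.score).text if rescored else ""
    return streaming, final


# Toy stand-ins for the shared encoder, the RNN-T decoder and the LAS decoder.
def toy_encoder(frame: Sequence[float]) -> float:
    return sum(frame)


def toy_first_pass(encodings: List[float]) -> List[Hypothesis]:
    if len(encodings) < 2:
        return [Hypothesis("wreck", -0.5)]
    return [
        Hypothesis("wreck a nice beach", -1.0),
        Hypothesis("recognize speech", -1.2),
    ]


def toy_rescorer(encodings: List[float], hyp: Hypothesis) -> float:
    # Pretend the second pass, attending over all encodings, prefers this text.
    bonus = 1.0 if hyp.text == "recognize speech" else 0.0
    return hyp.score + bonus


streaming, final = two_pass_recognize(
    [[0.1, 0.2], [0.3, 0.4]], toy_encoder, toy_first_pass, toy_rescorer
)
print(streaming)
print(final)
```

In this sketch the encoder runs once per frame and both passes consume its output, which mirrors the shared-encoder arrangement in the abstract; the streaming results come only from the first pass, and the final text is chosen after the second pass.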
Published/granted documents:
- US20220238101A1 TWO-PASS END TO END SPEECH RECOGNITION Publication/grant date: 2022-07-28