Efficient streaming non-recurrent on-device end-to-end model

Granted invention patent

US11715458B2 Efficient streaming non-recurrent on-device end-to-end model (in force)


Patent title: Efficient streaming non-recurrent on-device end-to-end model
Application number: US17316198
Filing date: 2021-05-10
Publication number: US11715458B2
Publication date: 2023-08-01
Inventors: Tara Sainath, Arun Narayanan, Rami Botros, Yanzhang He, Ehsan Variani, Cyril Allauzen, David Rybach, Ruoming Pang, Trevor Strohman
Applicant: Google LLC
Applicant address: Mountain View, CA, US
Patentee: Google LLC
Current patentee: Google LLC
Current patentee address: Mountain View, CA, US
Agency: Honigman LLP
Agents: Brett A. Krueger; Grant Griffith
Main classification: G10L15/00
IPC classifications: G10L15/00; G10L15/06; G10L15/02; G10L15/22; G10L15/30


Abstract:

An automatic speech recognition (ASR) model includes a first encoder configured to receive a sequence of acoustic frames and generate a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The ASR model also includes a second encoder configured to receive the first higher order feature representation generated by the first encoder at each of a plurality of output steps and generate a second higher order feature representation for a corresponding first higher order feature frame. The ASR model also includes a decoder configured to receive the second higher order feature representation generated by the second encoder at each of the plurality of output steps and generate a first probability distribution over possible speech recognition hypotheses. The ASR model also includes a language model configured to receive the first probability distribution over possible speech recognition hypotheses and generate a rescored probability distribution.
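The data flow described in the abstract (first encoder, second encoder, decoder, then language-model rescoring) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy `tanh` transforms standing in for the encoders, the cosine projection standing in for the decoder, and the log-linear interpolation used for rescoring are hypothetical and are not taken from the patent, which does not specify them in this abstract. The sketch also treats each output step as a distribution over a small fixed set of hypotheses.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Normalize scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def first_encoder(frame: Vector) -> Vector:
    # Hypothetical stand-in: maps one acoustic frame to a
    # "first higher order feature representation".
    return [math.tanh(x) for x in frame]


def second_encoder(feature: Vector) -> Vector:
    # Hypothetical stand-in: maps the first encoder's output to a
    # "second higher order feature representation".
    return [math.tanh(2.0 * x) for x in feature]


def decoder(feature: Vector, num_hypotheses: int) -> Vector:
    # Hypothetical stand-in: a fixed toy projection followed by softmax,
    # giving the "first probability distribution" over hypotheses.
    logits = [
        sum(f * math.cos(i + k) for i, f in enumerate(feature))
        for k in range(num_hypotheses)
    ]
    return softmax(logits)


def language_model_rescore(
    first_dist: Vector, lm_prior: Vector, lm_weight: float = 0.5
) -> Vector:
    # Assumed rescoring rule (not from the patent): log-linear
    # interpolation of the decoder distribution with a language-model prior.
    scores = [
        math.log(p) + lm_weight * math.log(q)
        for p, q in zip(first_dist, lm_prior)
    ]
    return softmax(scores)


def recognize(frames: Sequence[Vector], lm_prior: Vector) -> List[Vector]:
    """Run the cascade once per output step and return rescored distributions."""
    rescored = []
    for frame in frames:
        h1 = first_encoder(frame)
        h2 = second_encoder(h1)
        first_dist = decoder(h2, len(lm_prior))
        rescored.append(language_model_rescore(first_dist, lm_prior))
    return rescored


if __name__ == "__main__":
    toy_frames = [[0.1, -0.2, 0.3], [0.5, 0.0, -0.4]]
    toy_prior = [0.7, 0.2, 0.1]  # hypothetical language-model prior
    for step, dist in enumerate(recognize(toy_frames, toy_prior)):
        print(step, [round(p, 3) for p in dist])
```

With a uniform language-model prior, this assumed rescoring rule leaves the decoder's distribution unchanged, which makes it easy to check that the language model only reweights hypotheses and does not alter the cascade upstream of it.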

Published/granted documents

US20220310062A1 Efficient Streaming Non-Recurrent On-Device End-to-End Model (publication/grant date: 2022-09-29)


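IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)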