Input-feeding architecture for attention based end-to-end speech recognition

Granted invention patent

US10672382B2 Input-feeding architecture for attention based end-to-end speech recognition (status: under examination, published)

Patent title: Input-feeding architecture for attention based end-to-end speech recognition
Application number: US16160352

Filing date: 2018-10-15
Publication number: US10672382B2

Publication date: 2020-06-02
Inventors: Chao Weng, Jia Cui, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
Applicant: TENCENT AMERICA LLC
Applicant address: Palo Alto, CA, US
Patentee: TENCENT AMERICA LLC
Current patentee: TENCENT AMERICA LLC
Current patentee address: Palo Alto, CA, US
Agency: Sughrue Mion, PLLC
Main classification: G10L15/06
IPC classifications: G10L15/06; G10L15/14; G10L15/183; G10L15/22

Abstract:

Methods and apparatuses are provided for end-to-end speech recognition training performed by at least one processor. The method includes: receiving, by the at least one processor, one or more input speech frames; generating, by the at least one processor, a sequence of encoder hidden states by transforming the input speech frames; computing, by the at least one processor, attention weights based on each of the sequence of encoder hidden states and a current decoder hidden state; performing, by the at least one processor, a decoding operation based on previous embedded label prediction information and previous attentional hidden state information generated based on the attention weights; and generating current embedded label prediction information based on a result of the decoding operation and the attention weights.
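
The decoding step described in the abstract can be illustrated with a small forward-pass sketch. It is not the patented implementation: the abstract does not specify the recurrent cell, the attention scoring function, or any dimensions, so the tanh recurrence, the dot-product scores, the random weights, and all names here (`decode_step`, `make_params`, `W_dec`, `W_att`, `W_out`, `E`) are assumptions chosen for illustration. The sketch only shows how the previous embedded label and the previous attentional hidden state are fed back into the decoder input.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    """Small random weights that stand in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def make_params(hidden, embed_dim, vocab, rng):
    """Hypothetical parameter set; shapes are assumptions of this sketch."""
    return {
        "W_dec": rand_matrix(hidden, embed_dim + 2 * hidden, rng),
        "W_att": rand_matrix(hidden, 2 * hidden, rng),
        "W_out": rand_matrix(vocab, hidden, rng),
        "E": rand_matrix(vocab, embed_dim, rng),  # label embedding table
    }

def decode_step(enc_states, prev_embed, prev_att_hidden, prev_dec_hidden, params):
    """One decoder step with input feeding (simplified tanh recurrence)."""
    # Input feeding: the decoder input concatenates the previous embedded
    # label prediction with the previous attentional hidden state.
    dec_input = prev_embed + prev_att_hidden + prev_dec_hidden
    dec_hidden = [math.tanh(v) for v in matvec(params["W_dec"], dec_input)]

    # Attention weights from each encoder state and the current decoder
    # hidden state (dot-product scoring is an assumption).
    scores = [sum(h * s for h, s in zip(enc, dec_hidden)) for enc in enc_states]
    weights = softmax(scores)

    # Context vector: attention-weighted sum of encoder hidden states.
    dim = len(enc_states[0])
    context = [sum(w * enc[d] for w, enc in zip(weights, enc_states))
               for d in range(dim)]

    # Attentional hidden state combines the context and the decoder state.
    att_hidden = [math.tanh(v)
                  for v in matvec(params["W_att"], context + dec_hidden)]

    # Current label prediction and its embedding, fed back at the next step.
    probs = softmax(matvec(params["W_out"], att_hidden))
    label = max(range(len(probs)), key=probs.__getitem__)
    return params["E"][label], att_hidden, dec_hidden, weights, label

# Toy run: 6 encoder states of size 4, embedding size 3, 5 output labels.
rng = random.Random(0)
HIDDEN, EMBED, VOCAB, FRAMES = 4, 3, 5, 6
params = make_params(HIDDEN, EMBED, VOCAB, rng)
encoder_states = [[rng.uniform(-1, 1) for _ in range(HIDDEN)]
                  for _ in range(FRAMES)]

embed, att, dec = [0.0] * EMBED, [0.0] * HIDDEN, [0.0] * HIDDEN
for step in range(3):
    embed, att, dec, weights, label = decode_step(
        encoder_states, embed, att, dec, params)
    print(step, label, [round(w, 3) for w in weights])
```

Each loop iteration passes `embed` and `att` from the previous step back into `decode_step`, which is the feedback path the title calls input feeding. Training, which the abstract also covers, would add a loss over the label probabilities and is omitted here.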

Published/granted documents

US20200118547A1 INPUT-FEEDING ARCHITECTURE FOR ATTENTION BASED END-TO-END SPEECH RECOGNITION Publication/grant date: 2020-04-16

Information lookup

Espacenet

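IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	. Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)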