MIXTURE-OF-EXPERT CONFORMER FOR STREAMING MULTILINGUAL ASR

Invention Publication

US20240304185A1 MIXTURE-OF-EXPERT CONFORMER FOR STREAMING MULTILINGUAL ASR (Under Examination - Published)

Patent Title: MIXTURE-OF-EXPERT CONFORMER FOR STREAMING MULTILINGUAL ASR
Application No.: US18598885

Application Date: 2024-03-07
Publication No.: US20240304185A1

Publication Date: 2024-09-12
Inventor: Ke Hu, Bo Li, Tara N. Sainath, Yu Zhang, Francoise Beaufays
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Main IPC: G10L15/197
IPC: G10L15/197; G10L15/02; G10L15/06

Abstract:

A method for a multilingual automatic speech recognition (ASR) model includes receiving a sequence of acoustic frames characterizing an utterance of speech. At a plurality of output steps, the method further includes: generating, by a first encoder that includes a first plurality of multi-head attention layers, a first higher order feature representation for a corresponding acoustic frame; generating, by a second encoder that includes a second plurality of multi-head attention layers, a second higher order feature representation for a corresponding first higher order feature representation; and generating, by a first decoder, a first probability distribution over possible speech recognition hypotheses based on the second higher order feature representation and a sequence of N previous non-blank symbols. A gating layer of each respective mixture-of-expert (MoE) layer is configured to dynamically route, at each of the plurality of output steps, an output from a previous multi-head attention layer to a respective pair of feed-forward expert networks.
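
The routing step in the last sentence of the abstract can be illustrated with a minimal sketch. The abstract states only that a gating layer routes each attention output to a pair of feed-forward expert networks; everything else here is an assumption made for illustration: the softmax gate, the renormalization of the two selected gate weights, the ReLU experts, the toy dimensions, and all names (`moe_route`, `make_expert`, `gate_weights`).

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def make_expert(rng, dim, hidden):
    """Build a toy two-layer feed-forward expert (ReLU is an assumption)."""
    w_in = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(hidden)]
    w_out = [[rng.gauss(0, 0.1) for _ in range(hidden)] for _ in range(dim)]

    def expert(x):
        h = [max(0.0, v) for v in matvec(w_in, x)]
        return matvec(w_out, h)

    return expert


def moe_route(x, gate_weights, experts, k=2):
    """Route one attention output x to its top-k experts (k=2: a pair).

    Returns the gate-weighted sum of the chosen experts' outputs and the
    indices of the chosen experts.
    """
    probs = softmax(matvec(gate_weights, x))
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen pair (assumed)
    out = [0.0] * len(x)
    for i in top:
        weight = probs[i] / norm
        out = [o + weight * v for o, v in zip(out, experts[i](x))]
    return out, top


rng = random.Random(0)
DIM, HIDDEN, NUM_EXPERTS = 4, 8, 4  # hypothetical toy sizes
experts = [make_expert(rng, DIM, HIDDEN) for _ in range(NUM_EXPERTS)]
gate_weights = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

# Stand-ins for attention outputs at three output steps; the expert pair
# is chosen independently at every step, so it may differ between frames.
frames = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(3)]
for step, frame in enumerate(frames):
    out, chosen = moe_route(frame, gate_weights, experts)
    print(step, chosen)
```

Because the gate is evaluated per output step, no utterance-level language label is needed in this sketch; the selection depends only on the current attention output.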

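IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	. Speech classification or search
G10L15/18	.. using natural language modelling
G10L15/183	... using context dependencies, e.g. language models
G10L15/19	.... Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
G10L15/197	..... Probabilistic grammars, e.g. word n-grams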