1.
Publication No.: US11646017B1
Publication Date: 2023-05-09
Application No.: US17193414
Filing Date: 2021-03-05
Applicant: Meta Platforms, Inc.
Inventor: Yangyang Shi, Yongqiang Wang, Chunyang Wu, Ching-Feng Yeh, Julian Yui-Hin Chan, Qiaochu Zhang, Duc Hoang Le, Michael Lewis Seltzer
IPC: G10L15/16, G10L15/183, G06N3/04, G10L15/22
CPC classification number: G10L15/183, G06N3/0445, G10L15/16, G10L15/22
Abstract: In one embodiment, a method includes: accessing a machine-learning model configured to generate an encoding for an utterance by using a module to process data associated with each segment of the utterance in a series of iterations; performing, by the module, operations associated with an i-th segment during an n-th iteration, which include (1) receiving an input comprising input contextual embeddings generated for the i-th segment in a preceding iteration and a memory bank storing memory vectors generated in the preceding iteration for segments preceding the i-th segment, (2) generating attention outputs and a memory vector based on keys, values, and queries generated using the input, and (3) generating output contextual embeddings for the i-th segment based on the attention outputs; providing the memory vector to the module for performing operations associated with the i-th segment in a next iteration; and performing speech recognition by decoding the encoding of the utterance.
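
The per-segment processing described in the abstract can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the abstract does not say how keys, values, and queries are projected, how the memory vector is derived, or how outputs are formed from the attention outputs. Here the projections are taken to be the identity, the memory vector is taken to be the attention output for a mean-pooled summary query, and the output embeddings use a residual sum. The names `process_segment` and `encode` are hypothetical, and the final decoding step is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


def process_segment(segment, memory_bank):
    """Operations for the i-th segment during one iteration.

    segment:     contextual embeddings produced for this segment in the
                 preceding iteration (list of vectors).
    memory_bank: memory vectors produced in the preceding iteration for
                 the segments that come before this one.
    """
    # Assumption: identity projections, so keys and values are the raw input.
    context = memory_bank + segment
    dim = len(segment[0])
    # Assumption: the memory vector comes from a mean-pooled summary query.
    summary = [sum(vec[j] for vec in segment) / len(segment) for j in range(dim)]
    # Assumption: output embeddings are input plus attention output (residual).
    outputs = [
        [x + a for x, a in zip(vec, attend(vec, context, context))]
        for vec in segment
    ]
    memory_vector = attend(summary, context, context)
    return outputs, memory_vector


def encode(segments, num_iterations):
    """Run every segment through the module for a series of iterations."""
    embeddings = segments
    prev_memory = []  # memory vectors from the preceding iteration
    for _ in range(num_iterations):
        next_embeddings, next_memory = [], []
        for i, segment in enumerate(embeddings):
            bank = prev_memory[:i]  # only segments preceding the i-th one
            outputs, memory_vector = process_segment(segment, bank)
            next_embeddings.append(outputs)
            next_memory.append(memory_vector)
        # Memory vectors are handed to the next iteration.
        embeddings, prev_memory = next_embeddings, next_memory
    return embeddings, prev_memory


# Toy utterance: two segments of 2-dimensional frame embeddings.
toy_segments = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
encoding, memory = encode(toy_segments, num_iterations=2)
```

In this sketch the memory bank is empty during the first iteration, so each segment attends only to itself. From the second iteration on, segment i also attends to the memory vectors of segments 0 to i-1, which is what lets the context window grow without re-reading earlier frames.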