-
1.
Publication No.: US11646017B1
Publication Date: 2023-05-09
Application No.: US17193414
Filing Date: 2021-03-05
Applicant: Meta Platforms, Inc.
Inventor: Yangyang Shi, Yongqiang Wang, Chunyang Wu, Ching-Feng Yeh, Julian Yui-Hin Chan, Qiaochu Zhang, Duc Hoang Le, Michael Lewis Seltzer
IPC: G10L15/16, G10L15/183, G06N3/04, G10L15/22
CPC classification number: G10L15/183, G06N3/0445, G10L15/16, G10L15/22
Abstract: In one embodiment, a method includes accessing a machine-learning model configured to generate an encoding for an utterance by using a module to process data associated with each segment of the utterance in a series of iterations, performing operations associated with an i-th segment during an n-th iteration by the module, which include receiving an input comprising input contextual embeddings generated for the i-th segment in a preceding iteration and a memory bank storing memory vectors generated in the preceding iteration for segments preceding the i-th segment, generating attention outputs and a memory vector based on keys, values, and queries generated using the input, and generating output contextual embeddings for the i-th segment based on the attention outputs, providing the memory vector to the module for performing operations associated with the i-th segment in a next iteration, and performing speech recognition by decoding the encoding of the utterance.
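The per-segment, per-iteration flow described in this abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation. It assumes identity key/value/query projections, a single attention head, and a mean-pooled summary query for producing the memory vector. None of these details are stated in the abstract, and the names `process_segment` and `encode` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def process_segment(segment, memory_bank):
    """Module operations for the i-th segment during one iteration.

    segment:     contextual embeddings for this segment from the preceding iteration
    memory_bank: memory vectors from the preceding iteration for earlier segments
    Returns (output contextual embeddings, memory vector).
    """
    # Assumption: keys and values are the raw inputs (identity projections).
    context = memory_bank + segment
    outputs = [attend(frame, context, context) for frame in segment]
    # Assumption: the memory vector comes from a mean-pooled summary query.
    summary = mean_vector(segment)
    memory_vector = attend(summary, context, context)
    return outputs, memory_vector


def encode(segments, num_iterations):
    """Run the module over every segment for a fixed number of iterations."""
    embeddings = segments
    memories = []  # memory vectors from the preceding iteration (none at first)
    for _ in range(num_iterations):
        next_embeddings, next_memories = [], []
        for i, segment in enumerate(embeddings):
            # Only memory vectors of segments preceding the i-th are visible.
            outputs, memory_vector = process_segment(segment, memories[:i])
            next_embeddings.append(outputs)
            next_memories.append(memory_vector)
        embeddings, memories = next_embeddings, next_memories
    return embeddings


segments = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
encoded = encode(segments, num_iterations=2)
```

Here `segments` is a list of segments, each a list of frame vectors. The final decoding step that turns the encoding into recognized text is outside the scope of this sketch.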
-
2.
Publication No.: US20250103831A1
Publication Date: 2025-03-27
Application No.: US18471568
Filing Date: 2023-09-21
Applicant: META PLATFORMS, INC.
Inventor: Zeeshan Ahmed, Frank Torsten Bernd Seide, Yangyang Shi
IPC: G06F40/58, G06F3/16, G06F40/103, G06V20/20, G06V20/62
Abstract: Head-mounted displays may include a machine translation model designed to recognize text through optical character recognition or automatic speech recognition, and may translate the text from its original language to another language. The machine translation model may be trained to modify source text using various tasks, thus allowing the machine translation model to learn different versions of the source text in several different versions. The source text and a variation(s) derived from a task(s) may be mapped to a target text, representing the properly translated and formatted version of the source text. The machine translation model may provide a single model, to facilitate machine translation, implemented on the head-mounted display. Also, the machine translation model may include a bilingual machine translation model that may translate source text from one language to another language, and vice versa.
-
3.
Publication No.: US20230135179A1
Publication Date: 2023-05-04
Application No.: US17938561
Filing Date: 2022-10-06
Applicant: Meta Platforms, Inc.
Inventor: Sebastian Jonathan Mielke, Arthur David Szlam, Emily Dinan, Y-Lan Boureau, Mokhtar Mohamed Khorshid, Jeremy Dohmann, Brian Moran, Lintao Cui, Jonathan Richard Goetz, Ahmed Kamal Atwa Mohamed, Paul Anthony Crook, Andrea Madotto, Shrey Desai, Alexander Kolmykov-Zotov, Jason Pazis, Zhaojun Yang, Haichuan Yang, Yangyang Shi, Biqiao Zhang, Ivaylo Enchev, Xin Lei, Ming Sun
Abstract: In one embodiment, a system includes an automatic speech recognition (ASR) module, a natural-language understanding (NLU) module, a dialog manager, one or more agents, an arbitrator, a delivery system, one or more processors, and a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to receive a user input, process the user input using the ASR module, the NLU module, the dialog manager, one or more of the agents, the arbitrator, and the delivery system, and provide a response to the user input.
-