-
Publication number: US20230245654A1
Publication date: 2023-08-03
Application number: US18157413
Application date: 2023-01-20
Applicant: Meta Platforms, Inc.
Inventors: Akshat Shrivastava, Shrey Desai, Anchit Gupta, Ali Elkahky, Aleksandr Livshits, Alexander Kolmykov-Zotov, Ahmed Aly, Jinsong Yu, Manali Anand Naik, Shuhui Yang, Baiyang Liu, Surya Teja Appini, Tarun Vir Singh, Hang Su, Jiedan Zhu, Fuchun Peng, Shoubhik Bhattacharya, Kshitiz Malik, Shreyan Bakshi, Akash Bharadwaj, Harish Srinivas, Xiao Yang, Zhuangqun Huang, Gil Keren, Duc Hoang Le, Ahmed Kamal Atwa Mohamed, Zhe Liu, Pranab Mohanty
CPC classification numbers: G10L15/22, G10L15/1815, G10L15/30, G10L15/063, G10L15/197, H04L63/0428, G10L2015/223, G10L2015/086
Abstract: In one embodiment, a system includes an automatic speech recognition (ASR) module, a natural-language understanding (NLU) module, a dialog manager, one or more agents, an arbitrator, a delivery system, one or more processors, and a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to receive a user input, process the user input using the ASR module, the NLU module, the dialog manager, one or more of the agents, the arbitrator, and the delivery system, and provide a response to the user input.
-
Publication number: US12087306B1
Publication date: 2024-09-10
Application number: US17535005
Application date: 2021-11-24
Applicant: Meta Platforms, Inc.
Inventors: Duc Hoang Le, FNU Mahaveer, Gil Keren, Christian Fuegen, Yatharth Saraf
Abstract: In one embodiment, a method includes receiving a user's utterance comprising a word in a custom vocabulary list of the user, generating a previous token to represent a previous audio portion of the utterance, and generating a current token to represent a current audio portion of the utterance by generating a bias embedding by using the previous token to query a trie of wordpieces representing the custom vocabulary list, generating first probabilities of respective first candidate tokens likely uttered in the current audio portion based on the bias embedding and the current audio portion, generating second probabilities of respective second candidate tokens likely uttered after the previous token based on the previous token and the bias embedding, and generating the current token to represent the current audio portion of the utterance based on the first probabilities of the first candidate tokens and the second probabilities of the second candidate tokens.
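Illustrative note: the decoding step described in the abstract above can be sketched roughly as follows. This is a toy sketch under stated assumptions, not the patented method. The names `WordpieceTrie`, `decode_step`, `boost`, and `weight` are hypothetical. The learned bias embedding is replaced here by a simple additive score boost for wordpieces that the trie allows next, the trie is queried with the previous tokens of the current word rather than a single token, and the two sets of probabilities are merged by log-linear interpolation, which the abstract does not specify.

```python
import math


class WordpieceTrie:
    """Trie whose edges are wordpieces of the user's custom vocabulary entries."""

    def __init__(self):
        self.root = {}

    def insert(self, pieces):
        node = self.root
        for piece in pieces:
            node = node.setdefault(piece, {})

    def continuations(self, prefix):
        """Wordpieces that may follow `prefix`; empty set if the prefix is not in the trie."""
        node = self.root
        for piece in prefix:
            if piece not in node:
                return set()
            node = node[piece]
        return set(node)


def softmax(scores):
    """Normalize a {token: score} dict into probabilities."""
    top = max(scores.values())
    exps = {tok: math.exp(s - top) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def biased_probs(scores, boosted, boost):
    """Stand-in for the bias embedding (assumption): add `boost` to trie-allowed tokens."""
    return softmax({tok: s + (boost if tok in boosted else 0.0) for tok, s in scores.items()})


def decode_step(trie, prev_tokens, acoustic_scores, label_scores, boost=2.0, weight=0.5):
    """Pick the current token from biased acoustic and label-history probabilities."""
    boosted = trie.continuations(prev_tokens)                # query the trie with previous token(s)
    first = biased_probs(acoustic_scores, boosted, boost)    # "first probabilities" (audio side)
    second = biased_probs(label_scores, boosted, boost)      # "second probabilities" (token side)
    combined = {
        tok: (1.0 - weight) * math.log(first[tok]) + weight * math.log(second[tok])
        for tok in first
    }
    return max(combined, key=combined.get)


# Hypothetical custom vocabulary entry split into wordpieces.
trie = WordpieceTrie()
trie.insert(["ka", "tar", "zyna"])

# Made-up scores for three candidate tokens; both dicts share the same keys.
acoustic = {"tar": 1.0, "ter": 1.2, "zyna": 0.1}
label = {"tar": 0.5, "ter": 0.8, "zyna": 0.1}

print(decode_step(trie, ["ka"], acoustic, label))             # biased decoding
print(decode_step(trie, ["ka"], acoustic, label, boost=0.0))  # same step with biasing disabled
```

With these made-up scores, the unbiased step prefers the more common wordpiece, while the biased step prefers the wordpiece that continues the custom vocabulary entry.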
-