-
Publication No.: US20240363101A1
Publication Date: 2024-10-31
Application No.: US18771489
Filing Date: 2024-07-12
Inventors: Newton Jain, Sameer Syed Zaheer
CPC Classification: G10L15/063, G06F8/41, G10L15/16, G10L2015/088
Abstract: A server supports multiple virtual assistants. It receives requests that include wake phrase audio and an identification of the source of the request, such as a virtual assistant device. Based on the identification, the server searches a database for a wake phrase detector appropriate for the identified source. The server then applies the wake phrase detector to the received wake phrase audio. If the wake phrase audio triggers the wake phrase detector, the server provides an appropriate response to the source.
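The lookup-then-detect flow in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `DETECTORS`, `make_energy_detector`, and `handle_request` are hypothetical names, and a simple mean-energy threshold stands in for a real wake phrase model.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical detector type: takes audio samples, returns True if triggered.
Detector = Callable[[List[float]], bool]


def make_energy_detector(threshold: float) -> Detector:
    """Stand-in for a real wake phrase model: triggers on mean signal energy."""
    def detect(audio: List[float]) -> bool:
        if not audio:
            return False
        energy = sum(s * s for s in audio) / len(audio)
        return energy >= threshold
    return detect


# Hypothetical "database" mapping a source identifier to its detector.
DETECTORS: Dict[str, Detector] = {
    "assistant-a": make_energy_detector(0.25),
    "assistant-b": make_energy_detector(0.01),
}


def handle_request(source_id: str, audio: List[float]) -> Optional[str]:
    """Look up the detector for the identified source and apply it to the audio."""
    detector = DETECTORS.get(source_id)
    if detector is None:
        return None  # unknown source: no detector to apply
    if detector(audio):
        return f"wake phrase accepted for {source_id}"
    return None  # detector not triggered: no response
```

Keying the detector table on the source identifier lets one server host a different wake phrase per assistant without changing the request handler.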
-
Publication No.: US12131523B2
Publication Date: 2024-10-29
Application No.: US17182951
Filing Date: 2021-02-23
Applicant: Meta Platforms, Inc.
Inventors: Xiaohu Liu, Baiyang Liu, Rajen Subba
IPC Classification: G06V10/82, G06F3/01, G06F3/16, G06F7/14, G06F9/451, G06F16/176, G06F16/22, G06F16/23, G06F16/242, G06F16/2455, G06F16/2457, G06F16/248, G06F16/33, G06F16/332, G06F16/338, G06F16/903, G06F16/9032, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N7/01, G06N20/00, G06Q50/00, G06V10/764, G06V20/10, G06V40/20, G10L15/02, G10L15/06, G10L15/07, G10L15/16, G10L15/18, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L5/02, H04L12/28, H04L41/00, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/18, H04L51/216, H04L51/52, H04L67/306, H04L67/50, H04L67/5651, H04L67/75, H04W12/08, G10L13/00, G10L13/04, H04L51/046, H04L67/10, H04L67/53
CPC Classification: G06V10/82, G06F3/011, G06F3/013, G06F3/017, G06F3/167, G06F7/14, G06F9/453, G06F16/176, G06F16/2255, G06F16/2365, G06F16/243, G06F16/24552, G06F16/24575, G06F16/24578, G06F16/248, G06F16/3323, G06F16/3329, G06F16/3344, G06F16/338, G06F16/90332, G06F16/90335, G06F16/9038, G06F16/904, G06F16/951, G06F16/9535, G06F18/2411, G06F40/205, G06F40/295, G06F40/30, G06F40/40, G06N3/006, G06N3/08, G06N7/01, G06N20/00, G06Q50/01, G06V10/764, G06V20/10, G06V40/28, G10L15/02, G10L15/063, G10L15/07, G10L15/16, G10L15/1815, G10L15/1822, G10L15/183, G10L15/187, G10L15/22, G10L15/26, G10L17/06, G10L17/22, H04L5/02, H04L12/2816, H04L41/20, H04L41/22, H04L43/0882, H04L43/0894, H04L51/02, H04L51/18, H04L51/216, H04L51/52, H04L67/306, H04L67/535, H04L67/5651, H04L67/75, H04W12/08, G06F2216/13, G10L13/00, G10L13/04, G10L2015/223, G10L2015/225, H04L51/046, H04L67/10, H04L67/53
Abstract: In one embodiment, a method includes, by a client system associated with a user: receiving, at the client system, a user input from the user; parsing, by the client system, the user input to identify a request to execute a function to be performed by an assistant system of several assistant systems associated with the client system; determining whether the user is authorized to access the assistant system by comparing a voiceprint of the user to several voiceprints stored on the client system; sending, from the client system to the assistant system in response to determining the user is authorized to access the assistant system, a request to set an assistant xbot of the assistant system into a listening mode; and receiving, at the client system from the assistant system, an indication that the assistant xbot is in the listening mode.
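The voiceprint comparison step might look something like the sketch below. The abstract does not say how voiceprints are represented or compared; this example assumes fixed-length embedding vectors compared by cosine similarity, and the names `cosine_similarity` and `match_voiceprint` and the 0.9 threshold are hypothetical.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_voiceprint(
    candidate: List[float],
    stored: Dict[str, List[float]],
    threshold: float = 0.9,  # assumed value, not taken from the patent
) -> Optional[str]:
    """Return the best-matching enrolled user, or None if no score reaches the threshold."""
    best_user: Optional[str] = None
    best_score = threshold
    for user, voiceprint in stored.items():
        score = cosine_similarity(candidate, voiceprint)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user
```

Only when a match is found would the client go on to ask the assistant system to enter listening mode.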
-
Publication No.: US12130965B2
Publication Date: 2024-10-29
Application No.: US17859747
Filing Date: 2022-07-07
Applicant: Plume Design, Inc.
Inventors: Zhicheng Qiu, William J. McFarland
Abstract: Control systems and methods are provided that utilize a device, which can be worn by a user, to enable the user to enter control commands for causing a controller to control one or more electronic devices in a local network, such as a Wi-Fi system. A local control system, according to one implementation, includes a smart ring configured to obtain movement information related to one or more movements of the smart ring while a user is wearing the smart ring. The local control system also includes a controller device configured to communicate with the smart ring using Bluetooth or Wi-Fi signals. Characteristics of the movement information can be translated in order to obtain one or more control commands. The controller device is configured to control one or more aspects of one or more electronic devices based on the one or more control commands.
-
Publication No.: US20240355330A1
Publication Date: 2024-10-24
Application No.: US18226867
Filing Date: 2023-07-27
Applicant: DenComm Inc.
Inventor: Byung Joon LIM
CPC Classification: G10L15/26, G10L15/063
Abstract: A speech recognition device for dentistry includes a processor and a memory storing one or more instructions. The instructions are executed by the processor to obtain a sound containing noise and speech generated during a dental treatment, perform noise cancelling on the sound, and run a speech-to-text (STT) model trained using a self-supervised learning method. The STT model extracts a feature from each of the noise-cancelled sound and the sound without noise cancelling, obtains an encoding vector by assigning a weight to each of the features and processing the features, and obtains a script for the speech by decoding the encoding vector. Further, the self-supervised learning method includes a fine-tuning process in which sounds containing noise and speech generated during dental treatment, together with a script for each speech, are used as training data.
-
Publication No.: US20240355318A1
Publication Date: 2024-10-24
Application No.: US18137995
Filing Date: 2023-04-21
Applicant: Verint Americas Inc.
Inventor: Ian BEAVER
CPC Classification: G10L15/01, G10L15/063
Abstract: Disclosed embodiments pertain to training an intelligent virtual assistant through phased observational learning tasks. A pre-trained language model can be updated offline to produce a second language model with self-supervised learning based on transcripts of historical interactions between one or more customers, one or more customer service agents, and one or more data stores. The second language model can be evaluated and determined to satisfy a predetermined performance threshold. Subsequently, the second language model can be updated online to produce a third language model with reinforcement learning based on received customer input and similarity between a response provided by a customer service agent and a predicted response generated by the second language model. The third language model can then be deployed with an intelligent virtual assistant to respond to received user input.
-
Publication No.: US12125487B2
Publication Date: 2024-10-22
Application No.: US17450551
Filing Date: 2021-10-11
Applicant: SoundHound, Inc.
Inventors: Kiersten L. Bradley, Ethan Coeytaux, Ziming Yin
IPC Classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/06, G10L15/07
CPC Classification: G10L15/26, G06F40/134, G06F40/166, G06F40/284, G10L15/02, G10L15/063, G10L15/07, G10L2015/0631
Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
-
Publication No.: US12125479B2
Publication Date: 2024-10-22
Application No.: US17667483
Filing Date: 2022-02-08
Applicant: Seam Social Labs Inc
Inventors: Tiasia O'Brien, Marisa Jean Dinko
CPC Classification: G10L15/1815, G10L15/063, G10L15/22, G10L15/30, G10L25/63, G10L2015/223
Abstract: A system for providing a sociolinguistic virtual assistant includes a communication device, a processing device, and a storage device. The processing device is configured to: process input data using a natural language processing algorithm; categorize the semantic data based on psych-sociological categorizations associated with the at least one user; analyze the command from the at least one user to identify a task associated with the command; generate a response based on identification of the task associated with the command; and execute the task associated with the command, using the categorized semantic data, to derive a result. A method corresponding to the system is also provided.
-
Publication No.: US12118994B2
Publication Date: 2024-10-15
Application No.: US17676646
Filing Date: 2022-02-21
Applicant: GOOGLE LLC
Inventors: Sriram Natarajan, Yuxin Yu, Josh Brown, David Notario
IPC Classification: G10L15/22, G06F3/0481, G06F3/0488, G06F3/16, G06F9/451, G10L15/06, G10L15/08, B60R16/037
CPC Classification: G10L15/22, G06F3/0481, G06F3/0488, G06F3/167, G06F9/453, G10L15/063, G10L15/08, B60R16/0373, G10L2015/088, G10L2015/223, G10L2015/227
Abstract: Implementations set forth herein relate to an automated assistant that can provide suggestions for a user to interact with the automated assistant to control applications while in a vehicle. The suggestions can be provided to encourage hands-free interactions with the applications, by suggesting an assistant input that invokes the automated assistant to operate as an interface between the user and the applications. Assistant suggestions can be based on a context of a user and/or a context of the vehicle, such as content of a display interface of a device that the user is accessing while in the vehicle. For instance, the automated assistant can determine that an action that the user has employed an application to perform can be initialized more safely and/or in less time by utilizing a particular assistant input. This particular assistant input can then be rendered at an interface of a vehicle computing device.
-
Publication No.: US20240339123A1
Publication Date: 2024-10-10
Application No.: US18470788
Filing Date: 2023-09-20
Inventors: Chou-Chang Yang, Yashas Malur Saidutta, Rakshith Sharma Srinivasa, Ching-Hua Lee, Yilin Shen, Hongxia Jin
IPC Classification: G10L21/0232, G10L15/06, G10L15/08, G10L25/18
CPC Classification: G10L21/0232, G10L15/063, G10L15/08, G10L25/18, G10L2015/088
Abstract: A method includes receiving an audio input and generating a noisy time-frequency representation based on the audio input. The method also includes providing the noisy time-frequency representation to a noise management model trained to predict a denoising mask and a signal presence probability (SPP) map indicating a likelihood of a presence of speech. The method further includes determining an enhanced spectrogram using the denoising mask and the noisy time-frequency representation. The method also includes providing the enhanced spectrogram and the SPP map as inputs to a keyword classification model trained to determine a likelihood of a keyword being present in the audio input. In addition, the method includes, responsive to determining that a keyword is in the audio input, transmitting the audio input to a downstream application associated with the keyword.
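One common way to combine a denoising mask with a noisy time-frequency representation is element-wise multiplication. The abstract does not state which operation is used, so the sketch below assumes that approach. `apply_mask` is a hypothetical helper operating on magnitude spectrograms stored as nested lists (rows are frequency bins, columns are time frames).

```python
from typing import List

Matrix = List[List[float]]


def apply_mask(noisy: Matrix, mask: Matrix) -> Matrix:
    """Element-wise product of a noisy spectrogram and a mask of the same shape.

    Assumption: mask values lie in [0, 1], where 0 suppresses a
    time-frequency cell and 1 leaves it unchanged.
    """
    if len(noisy) != len(mask) or any(
        len(n_row) != len(m_row) for n_row, m_row in zip(noisy, mask)
    ):
        raise ValueError("spectrogram and mask must have the same shape")
    return [
        [n * m for n, m in zip(n_row, m_row)]
        for n_row, m_row in zip(noisy, mask)
    ]
```

In the pipeline the abstract describes, the resulting enhanced spectrogram would then be passed, together with the SPP map, to the keyword classification model.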
-
Publication No.: US12112751B2
Publication Date: 2024-10-08
Application No.: US17673972
Filing Date: 2022-02-17
Inventors: Taegu Kim, Hyeonjae Bak, Yoonju Lee, Hansin Koh, Jooyeon Kim, Gajin Song, Jaeyung Yeo
CPC Classification: G10L15/22, G10L15/063, G10L15/1815, G10L15/30, G10L2015/0635, G10L2015/223, G10L2015/228
Abstract: An electronic device, according to various embodiments, comprises a communication interface, a processor, and a memory. The memory may store instructions that, when executed, cause the processor to: obtain a user utterance; confirm context information associated with the user utterance; on the basis of the context information, select, as a target device, at least one external electronic device from among a plurality of external electronic devices; and via the communication interface, transmit at least a part of the context information to the at least one external electronic device selected as the target device. Various other embodiments are possible.
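Context-based target selection could be sketched as a simple filter over known devices. Everything here is hypothetical: the abstract does not specify what the context information contains, so the example assumes it carries a room and a required capability, and the `Device` fields and `select_targets` function are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Device:
    """Hypothetical description of an external electronic device."""
    name: str
    room: str
    capabilities: FrozenSet[str]


def select_targets(context: Dict[str, str], devices: List[Device]) -> List[Device]:
    """Select devices whose room and capabilities match the utterance context."""
    room = context.get("room")
    capability = context.get("capability")
    return [
        d for d in devices
        if d.room == room and capability in d.capabilities
    ]
```

The selected devices would then receive the relevant part of the context over the communication interface.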
-