Patent search: cpc:"G10L15/063" (page 1)

1.

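Invention publication
SERVER SUPPORTED RECOGNITION OF WAKE PHRASES (under examination, published)

Publication (announcement) number: US20240363101A1

Publication (announcement) date: 2024-10-31

Application number: US18771489

Filing date: 2024-07-12

Applicant: SoundHound AI IP, LLC.

Inventor(s): Newton Jain , Sameer Syed Zaheer

IPC classification: G10L15/06 , G06F8/41 , G10L15/08 , G10L15/16

CPC classification: G10L15/063 , G06F8/41 , G10L15/16 , G10L2015/088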

Abstract: A server supports multiple virtual assistants. It receives requests that include wake phrase audio and an identification of the source of the request, such as a virtual assistant device. Based on the identification, the server searches a database for a wake phrase detector appropriate for the identified source. The server then applies the wake phrase detector to the received wake phrase audio. If the wake phrase audio triggers the wake phrase detector, the server provides an appropriate response to the source.
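The lookup-then-apply flow in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the `DETECTORS` dictionary stands in for the database, `energy_detector` is a toy substitute for a trained wake phrase model, and the source identifiers and response string are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical detector type: takes audio samples, returns True if the wake phrase fires.
Detector = Callable[[Sequence[float]], bool]


def energy_detector(threshold: float) -> Detector:
    """Toy stand-in for a trained wake phrase model: fires on mean energy above a threshold."""
    def detect(audio: Sequence[float]) -> bool:
        if not audio:
            return False
        energy = sum(s * s for s in audio) / len(audio)
        return energy > threshold
    return detect


# Hypothetical "database" keyed by the identification of the request source.
DETECTORS: Dict[str, Detector] = {
    "assistant-a": energy_detector(0.10),
    "assistant-b": energy_detector(0.50),
}


def handle_request(source_id: str, audio: Sequence[float]) -> Optional[str]:
    """Look up the detector for the source, apply it, and respond only if it triggers."""
    detector = DETECTORS.get(source_id)
    if detector is None:
        return None  # no detector registered for this source
    if detector(audio):
        return f"wake-confirmed:{source_id}"
    return None  # audio did not trigger the detector
```

Keying the detectors by source is what lets one server hold a different wake phrase model for each virtual assistant it supports.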

2.

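Invention grant
Multiple wake words for systems with multiple smart assistants (in force)

Publication (announcement) number: US12131523B2

Publication (announcement) date: 2024-10-29

Application number: US17182951

Filing date: 2021-02-23

Applicant: Meta Platforms, Inc.

Inventor(s): Xiaohu Liu , Baiyang Liu , Rajen Subba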

IPC classification: G06V10/82 , G06F3/01 , G06F3/16 , G06F7/14 , G06F9/451 , G06F16/176 , G06F16/22 , G06F16/23 , G06F16/242 , G06F16/2455 , G06F16/2457 , G06F16/248 , G06F16/33 , G06F16/332 , G06F16/338 , G06F16/903 , G06F16/9032 , G06F16/9038 , G06F16/904 , G06F16/951 , G06F16/9535 , G06F18/2411 , G06F40/205 , G06F40/295 , G06F40/30 , G06F40/40 , G06N3/006 , G06N3/08 , G06N7/01 , G06N20/00 , G06Q50/00 , G06V10/764 , G06V20/10 , G06V40/20 , G10L15/02 , G10L15/06 , G10L15/07 , G10L15/16 , G10L15/18 , G10L15/183 , G10L15/187 , G10L15/22 , G10L15/26 , G10L17/06 , G10L17/22 , H04L5/02 , H04L12/28 , H04L41/00 , H04L41/22 , H04L43/0882 , H04L43/0894 , H04L51/02 , H04L51/18 , H04L51/216 , H04L51/52 , H04L67/306 , H04L67/50 , H04L67/5651 , H04L67/75 , H04W12/08 , G10L13/00 , G10L13/04 , H04L51/046 , H04L67/10 , H04L67/53

CPC classification: G06V10/82 , G06F3/011 , G06F3/013 , G06F3/017 , G06F3/167 , G06F7/14 , G06F9/453 , G06F16/176 , G06F16/2255 , G06F16/2365 , G06F16/243 , G06F16/24552 , G06F16/24575 , G06F16/24578 , G06F16/248 , G06F16/3323 , G06F16/3329 , G06F16/3344 , G06F16/338 , G06F16/90332 , G06F16/90335 , G06F16/9038 , G06F16/904 , G06F16/951 , G06F16/9535 , G06F18/2411 , G06F40/205 , G06F40/295 , G06F40/30 , G06F40/40 , G06N3/006 , G06N3/08 , G06N7/01 , G06N20/00 , G06Q50/01 , G06V10/764 , G06V20/10 , G06V40/28 , G10L15/02 , G10L15/063 , G10L15/07 , G10L15/16 , G10L15/1815 , G10L15/1822 , G10L15/183 , G10L15/187 , G10L15/22 , G10L15/26 , G10L17/06 , G10L17/22 , H04L5/02 , H04L12/2816 , H04L41/20 , H04L41/22 , H04L43/0882 , H04L43/0894 , H04L51/02 , H04L51/18 , H04L51/216 , H04L51/52 , H04L67/306 , H04L67/535 , H04L67/5651 , H04L67/75 , H04W12/08 , G06F2216/13 , G10L13/00 , G10L13/04 , G10L2015/223 , G10L2015/225 , H04L51/046 , H04L67/10 , H04L67/53

Abstract: In one embodiment, a method includes by a client system associated with a user, receiving, at the client system, a user input from the user, parsing, by the client system, the first user input to identify a request to execute a function to be performed by an assistant system of several assistant systems associated with the client system, determining whether the user is authorized to access the assistant system by comparing a voiceprint of the user to several voiceprints stored on the client system, sending, from the client system to the assistant system in response to determining the user is authorized to access the assistant system, a request to set an assistant xbot of the assistant system into a listening mode, and receiving, at the client system from the assistant system, an indication that the assistant xbot is in listening mode.
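The voiceprint comparison step in this abstract might look roughly like the sketch below. The abstract does not say how voiceprints are represented or compared, so everything here is an assumption for illustration: voiceprints are treated as fixed-length embedding vectors, the comparison uses cosine similarity, and the 0.8 threshold and the names `cosine_similarity` and `match_voiceprint` are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("voiceprints must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_voiceprint(
    probe: Sequence[float],
    enrolled: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the patent
) -> Optional[str]:
    """Return the best-matching enrolled user at or above the threshold, else None."""
    best_user: Optional[str] = None
    best_score = threshold
    for user, reference in enrolled.items():
        score = cosine_similarity(probe, reference)
        if score >= best_score:
            best_user, best_score = user, score
    return best_user
```

In this reading, a non-`None` result would be what allows the client to send the listening-mode request to the assistant system.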

3.

Invention grant
Ring enabling its wearer to enter control commands (in force)

Publication (announcement) number: US12130965B2

Publication (announcement) date: 2024-10-29

Application number: US17859747

Filing date: 2022-07-07

Applicant: Plume Design, Inc.

Inventor(s): Zhicheng Qiu , William J. McFarland

IPC classification: G06F3/01 , G06F1/16 , G06F3/16 , G10L15/06 , H04W4/80

CPC classification: G06F3/014 , G06F1/163 , G06F3/016 , G06F3/017 , G06F3/167 , G10L15/063 , H04W4/80

Abstract: Control systems and methods are provided that utilize a device, which can be worn by a user, to enable the user to enter control commands for causing a controller to control one or more electronic devices in a local network, such as a Wi-Fi system. A local control system, according to one implementation, includes a smart ring configured to obtain movement information related to one or more movements of the smart ring while a user is wearing the smart ring. The local control system also includes a controller device configured to communicate with the smart ring using Bluetooth or Wi-Fi signals. Characteristics of the movement information can be translated in order to obtain one or more control commands. The controller device is configured to control one or more aspects of one or more electronic devices based on the one or more control commands.

4.

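Invention publication
SPEECH RECOGNITION DEVICE FOR DENTISTRY AND METHOD USING THE SAME (under examination, published)

Publication (announcement) number: US20240355330A1

Publication (announcement) date: 2024-10-24

Application number: US18226867

Filing date: 2023-07-27

Applicant: DenComm Inc.

Inventor(s): Byung Joon LIM

IPC classification: G10L15/26 , G10L15/06

CPC classification: G10L15/26 , G10L15/063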

Abstract: A speech recognition device for dentistry includes a processor, and a memory storing one or more instructions. The instructions are executed by the processor to obtain a sound containing noise and speech generated during a dental treatment, perform a noise cancelling on the sound, and run a speech-to-text (STT) model trained using a self-supervised learning method. The STT model executes extracting a feature from each of the sound with the noise cancelling and from the sound without the noise cancelling, obtaining an encoding vector by assigning a weight to each of the features and processing the features, and obtaining a script for the speech by decoding the encoding vector. Further, the self-supervised learning method includes a fine-tuning process in which sounds containing noises and speeches that are generated during the dental treatment and a script for each of the speeches are used as a training data.

5.

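Invention publication
INTELLIGENT VIRTUAL ASSISTANT TRAINING THROUGH PHASED OBSERVATIONAL LEARNING TASKS (under examination, published)

Publication (announcement) number: US20240355318A1

Publication (announcement) date: 2024-10-24

Application number: US18137995

Filing date: 2023-04-21

Applicant: Verint Americas Inc.

Inventor(s): Ian BEAVER

IPC classification: G10L15/01 , G10L15/06

CPC classification: G10L15/01 , G10L15/063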

Abstract: Disclosed embodiments pertain to training an intelligent virtual assistant through phased observational learning tasks. A pre-trained language model can be updated offline to produce a second language model with self-supervised learning based on transcripts of historical interactions between one or more customers, one or more customer service agents, and one or more data stores. The second language model can be evaluated and determined to satisfy a predetermined performance threshold. Subsequently, the second language model can be updated online to produce a third language model with reinforcement learning based on received customer input and similarity between a response provided by a customer service agent and a predicted response generated by the second language model. The third language model can then be deployed with an intelligent virtual assistant to respond to received user input.

6.

Invention grant
Method and system for conversation transcription with metadata (in force)

Publication (announcement) number: US12125487B2

Publication (announcement) date: 2024-10-22

Application number: US17450551

Filing date: 2021-10-11

Applicant: SoundHound, Inc.

Inventor(s): Kiersten L. Bradley , Ethan Coeytaux , Ziming Yin

IPC classification: G10L15/26 , G06F40/134 , G06F40/166 , G06F40/284 , G10L15/02 , G10L15/06 , G10L15/07

CPC classification: G10L15/26 , G06F40/134 , G06F40/166 , G06F40/284 , G10L15/02 , G10L15/063 , G10L15/07 , G10L2015/0631

Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed and multiuser-editable transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript by one or more editors. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.

7.

Invention grant
Systems and methods for providing a sociolinguistic virtual assistant (in force)

Publication (announcement) number: US12125479B2

Publication (announcement) date: 2024-10-22

Application number: US17667483

Filing date: 2022-02-08

Applicant: Seam Social Labs Inc

Inventor(s): Tiasia O'Brien , Marisa Jean Dinko

IPC classification: G10L15/18 , G10L15/06 , G10L15/22 , G10L15/30 , G10L25/63

CPC classification: G10L15/1815 , G10L15/063 , G10L15/22 , G10L15/30 , G10L25/63 , G10L2015/223

Abstract: A system for providing a sociolinguistic virtual assistant includes a communication device, a processing device, and a storage device. The processing device being configured to process input data using a natural language processing algorithm; categorize the semantic data based on psych-sociological categorizations associated with the at least one user; analyze the command from the at least one user to identify a task associated with the command; generate a response based on identification of the task associated with the command; execute the task associated with the command using categorized semantic data, to derive a result. A method corresponding to the system is also provided.

8.

Invention grant
Providing contextual automated assistant action suggestion(s) via a vehicle computing device (in force)

Publication (announcement) number: US12118994B2

Publication (announcement) date: 2024-10-15

Application number: US17676646

Filing date: 2022-02-21

Applicant: GOOGLE LLC

Inventor(s): Sriram Natarajan , Yuxin Yu , Josh Brown , David Notario

IPC classification: G10L15/22 , G06F3/0481 , G06F3/0488 , G06F3/16 , G06F9/451 , G10L15/06 , G10L15/08 , B60R16/037

CPC classification: G10L15/22 , G06F3/0481 , G06F3/0488 , G06F3/167 , G06F9/453 , G10L15/063 , G10L15/08 , B60R16/0373 , G10L2015/088 , G10L2015/223 , G10L2015/227

Abstract: Implementations set forth herein relate to an automated assistant that can provide suggestions for a user to interact with the automated assistant to control applications while in a vehicle. The suggestions can be provided to encourage hands-free interactions with the applications, by suggesting an assistant input that invokes the automated assistant to operate as an interface between the user and the applications. Assistant suggestions can be based on a context of a user and/or a context of the vehicle, such as content of a display interface of a device that the user is accessing while in the vehicle. For instance, the automated assistant can determine that an action that the user has employed an application to perform can be initialized more safely and/or in less time by utilizing a particular assistant input. This particular assistant input can then be rendered at an interface of a vehicle computing device.

9.

Invention publication
SYSTEM AND METHOD FOR KEYWORD SPOTTING IN NOISY ENVIRONMENTS (under examination, published)

Publication (announcement) number: US20240339123A1

Publication (announcement) date: 2024-10-10

Application number: US18470788

Filing date: 2023-09-20

Applicant: Samsung Electronics Co., Ltd.

Inventor(s): Chou-Chang Yang , Yashas Malur Saidutta , Rakshith Sharma Srinivasa , Ching-Hua Lee , Yilin Shen , Hongxia Jin

IPC classification: G10L21/0232 , G10L15/06 , G10L15/08 , G10L25/18

CPC classification: G10L21/0232 , G10L15/063 , G10L15/08 , G10L25/18 , G10L2015/088

Abstract: A method includes receiving an audio input and generating a noisy time-frequency representation based on the audio input. The method also includes providing the noisy time-frequency representation to a noise management model trained to predict a denoising mask and a signal presence probability (SPP) map indicating a likelihood of a presence of speech. The method further includes determining an enhanced spectrogram using the denoising mask and the noisy time-frequency representation. The method also includes providing the enhanced spectrogram and the SPP map as inputs to a keyword classification model trained to determine a likelihood of a keyword being present in the audio input. In addition, the method includes, responsive to determining that a keyword is in the audio input, transmitting the audio input to a downstream application associated with the keyword.
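The data flow in this abstract (noisy representation, denoising mask, enhanced spectrogram, SPP map, keyword decision) can be sketched as below. Several parts are assumptions made for illustration and are not stated in the abstract: the mask is applied by element-wise multiplication, the trained keyword classification model is replaced by a toy SPP-weighted average, and the function names and the 0.5 threshold are hypothetical.

```python
from typing import List

Matrix = List[List[float]]  # rows = time frames, columns = frequency bins


def apply_mask(noisy: Matrix, mask: Matrix) -> Matrix:
    """Assumed enhancement step: element-wise product of noisy magnitudes and the mask."""
    if len(noisy) != len(mask) or any(len(a) != len(b) for a, b in zip(noisy, mask)):
        raise ValueError("noisy representation and mask must have the same shape")
    return [[n * m for n, m in zip(n_row, m_row)] for n_row, m_row in zip(noisy, mask)]


def keyword_score(enhanced: Matrix, spp: Matrix) -> float:
    """Toy stand-in for the keyword classifier: SPP-weighted mean of enhanced magnitudes."""
    total_weight = sum(sum(row) for row in spp)
    if total_weight == 0:
        return 0.0  # no speech presence anywhere, so nothing to score
    weighted = sum(
        e * p for e_row, p_row in zip(enhanced, spp) for e, p in zip(e_row, p_row)
    )
    return weighted / total_weight


def spot_keyword(noisy: Matrix, mask: Matrix, spp: Matrix, threshold: float = 0.5) -> bool:
    """Enhance the input, score it together with the SPP map, and compare to a threshold."""
    enhanced = apply_mask(noisy, mask)
    return keyword_score(enhanced, spp) >= threshold
```

The point of the sketch is the interface: the classifier receives both the enhanced spectrogram and the SPP map, so regions unlikely to contain speech can be discounted.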

10.

Invention grant
Electronic device for processing user utterance and method for operating same (in force)

Publication (announcement) number: US12112751B2

Publication (announcement) date: 2024-10-08

Application number: US17673972

Filing date: 2022-02-17

Applicant: Samsung Electronics Co., Ltd.

Inventor(s): Taegu Kim , Hyeonjae Bak , Yoonju Lee , Hansin Koh , Jooyeon Kim , Gajin Song , Jaeyung Yeo

IPC classification: G10L15/22 , G10L15/06 , G10L15/18 , G10L15/30

CPC classification: G10L15/22 , G10L15/063 , G10L15/1815 , G10L15/30 , G10L2015/0635 , G10L2015/223 , G10L2015/228

Abstract: An electronic device, according to various embodiments, comprises a communication interface, a processor, and a memory. The memory may store instructions that, when executed, cause the processor to: obtain a user utterance; confirm context information associated with the user utterance; on the basis of the context information, select, as a target device, at least one external electronic device from among a plurality of external electronic devices; and via the communication interface, transmit at least a part of the context information to the at least one external electronic device selected as the target device. Various other embodiments are possible.
