Patent search cpc:"G10L15/063", page 4

31.

Granted patent
Medical assessment based on voice (Active)

Publication No.: US12051513B2

Publication date: 2024-07-30

Application No.: US18347382

Filing date: 2023-07-05

Applicant: Canary Speech, LLC

Inventors: Jangwon Kim, Namhee Kwon, Henry O'Connell, Phillip Walstad, Kevin Shengbin Yang

IPC classification: G16H80/00, A61B5/00, A61B5/11, G06N3/08, G06N7/01, G06N20/10, G10L25/66, G16H10/20, G16H40/67, G16H50/20, G16H50/50, G06F111/10, G10L15/02, G10L15/06, G10L15/22

CPC classification: G16H80/00, A61B5/1123, A61B5/4088, A61B5/4803, A61B5/7267, G06N3/08, G06N7/01, G06N20/10, G10L25/66, G16H10/20, G16H40/67, G16H50/20, G16H50/50, G06F2111/10, G10L15/02, G10L15/063, G10L15/22

Abstract: Apparatuses, systems, methods, and computer program products are disclosed for medical assessment based on voice. A query module is configured to audibly question a user from an electronic display screen and/or a speaker of a computing device with one or more open ended questions. A response module is configured to receive a conversational verbal response of a user from a microphone of a computing device in response to one or more open ended questions. A detection module is configured to provide a machine learning assessment for a user of a medical condition based on a machine learning analysis of a received conversational verbal response of the user.

32.

Granted patent
Promoting voice actions to hotwords (Active)

Publication No.: US12051408B2

Publication date: 2024-07-30

Application No.: US16838966

Filing date: 2020-04-02

Applicant: Google LLC

Inventors: Matthew Sharifi

IPC classification: G10L15/22, G06F3/16, G10L15/02, G10L15/06, G10L15/08, G10L15/26, G10L15/28, G10L17/22

CPC classification: G10L15/22, G06F3/167, G10L15/02, G10L15/063, G10L15/08, G10L15/26, G10L15/285, G10L17/22, G10L2015/088, G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for designating certain voice commands as hotwords. The methods, systems, and apparatus include actions of receiving a hotword followed by a voice command. Additional actions include determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, where a voice command that is designated as a hotword is treated as a voice input regardless of whether the voice command is preceded by another hotword. Further actions include, in response to determining that the voice command satisfies one or more predetermined criteria associated with designating the voice command as a hotword, designating the voice command as a hotword.

33.

Granted patent
Heliumspeech unscrambling method and system for saturation diving based on multi-objective optimization (Active)

Publication No.: US12039988B1

Publication date: 2024-07-16

Application No.: US18424695

Filing date: 2024-01-26

Applicant: Nantong University

Inventors: Shibing Zhang, Jianrong Wu

IPC classification: G10L21/02, G10L15/06, G10L15/20, G10L25/51

CPC classification: G10L21/02, G10L15/063, G10L15/20, G10L25/51, G10L2015/0631

Abstract: The present application discloses a method and a system for saturation diving heliumspeech unscrambling based on multi-objective optimization. In a system including a diver and a filter at least, a working language phonetic symbol library and a common working word library for divers are constructed. The divers read them one by one, and a phonetic symbol standard speech library, a phonetic symbol heliumspeech library and a common working word speech library are generated. The filter uses the multi-objective optimization algorithm to design its impulse response coefficients, corrects and unscrambles the tagged and sampled heliumspeech signal word by word, and continuously updates the impulse response coefficients to complete the perfect heliumspeech unscrambling.

34.

Granted patent
Joint automatic speech recognition and speaker diarization (Active)

Publication No.: US12039982B2

Publication date: 2024-07-16

Application No.: US17601662

Filing date: 2020-04-06

Applicant: Google LLC

Inventors: Laurent El Shafey, Hagen Soltau, Izhak Shafran

IPC classification: G10L15/22, G10L15/26, G10L15/30, G10L17/18, G10L15/06

CPC classification: G10L17/18, G10L15/22, G10L15/26, G10L15/30, G10L15/063

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing audio data using neural networks.

35.

Published application
SOUND SOURCE SEPARATION METHOD, SOUND SOURCE SEPARATION APPARATUS, AND PROGRAM (Lapsed)

Publication No.: US20240233744A9

Publication date: 2024-07-11

Application No.: US18275950

Filing date: 2021-02-08

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Naoki MAKISHIMA, Ryo MASUMURA

IPC classification: G10L21/028, G10L15/06, G10L15/25

CPC classification: G10L21/028, G10L15/063, G10L15/25

Abstract: A mixed acoustic signal including sound emitted from a plurality of sound sources and sound source video signals representing at least one video of the plurality of sound sources are received as inputs, and at least a separated signal including a signal representing a target sound emitted from one sound source represented by the video is acquired. However, at least the separated signal is acquired using properties of the sound source that affects sound emitted by the sound source acquired from the video and/or features of a structure used for the sound source to emit the sound.

36.

Published application
Event-based semantic search and retrieval (Pending, published)

Publication No.: US20240233715A1

Publication date: 2024-07-11

Application No.: US18118282

Filing date: 2023-03-07

Applicant: Drift.com, Inc.

Inventors: Jeffrey D. Orkin, Christopher M. Ward, Elias Torres

IPC classification: G10L15/18, G06F16/33, G06N20/00, G10L15/06

CPC classification: G10L15/1815, G06F16/3344, G06N20/00, G10L15/063

Abstract: A technique for semantic search and retrieval that is event-based, wherein an event is composed of a sequence of observations that are user speech or physical actions. Using a first set of conversations, a machine learning model is trained against groupings of utterances therein to generate a speech act classifier. Observation sequences therein are organized into groupings of events and configured for subsequent event recognition. A set of second (unannotated) conversations are then received. The set of second conversations is evaluated using the speech act classifier and information retrieved from the event recognition to generate event-level metadata that comprises, for each utterance or physical action within an event, one or more associated tags. In response to a query, a search is performed against the metadata. Because the metadata is derived from event recognition, the search is performed against events learned from the set of first conversations. One or more conversation fragments that, from an event-based perspective, are semantically-relevant to the query, are returned.

37.

Granted patent
Method for speech recognition based on language adaptivity and related apparatus (Active)

Publication No.: US12033621B2

Publication date: 2024-07-09

Application No.: US17231945

Filing date: 2021-04-15

Applicant: Tencent Technology (Shenzhen) Company Limited

Inventors: Dan Su, Tianxiao Fu, Min Luo, Qi Chen, Yulu Zhang, Lin Luo

IPC classification: G10L15/187, G10L15/00, G10L15/02, G10L15/06, G10L15/22

CPC classification: G10L15/187, G10L15/005, G10L15/02, G10L15/063, G10L15/22, G10L2015/025

Abstract: A method for speech recognition based on language adaptivity comprises obtaining voice data of a user. The method also comprises extracting, based on the obtained voice data, a phoneme feature representing pronunciation phoneme information. The phoneme feature is input to a pre-trained language discrimination model that is pre-trained based on a multilingual corpus. A language discrimination result corresponding to the phoneme feature and in accordance with the language discrimination model is obtained. The method also comprises obtaining a speech recognition result of the voice data based on a language acoustic model of a language corresponding to the language discrimination result. The method further comprises determining a speech recognition result of the voice data based on a language acoustic model of a language corresponding to the language discrimination result.

38.

Published application
KEY PHRASE SPOTTING (Pending, published)

Publication No.: US20240221750A1

Publication date: 2024-07-04

Application No.: US18610233

Filing date: 2024-03-19

Applicant: Google LLC

Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin

IPC classification: G10L15/22, G10L15/02, G10L15/06, G10L15/08, G10L15/14, G10L15/18, G10L19/00

CPC classification: G10L15/22, G10L15/02, G10L15/063, G10L15/18, G10L19/00, G10L2015/025, G10L2015/088, G10L15/142, G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.

39.

Published application
VOICE RECOGNITION MODEL TRAINING METHOD, VOICE RECOGNITION METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM (Pending, published)

Publication No.: US20240221727A1

Publication date: 2024-07-04

Application No.: US18266432

Filing date: 2022-09-01

Applicant: Beijing Baidu Netcom Science Technology Co., Ltd.

Inventors: Lanhua YOU, Lei JIA, Qi ZHANG, Zhengxiang JIANG

IPC classification: G10L15/06, G10L15/01, G10L15/02, G10L15/16

CPC classification: G10L15/063, G10L15/01, G10L15/02, G10L15/16

Abstract: The present disclosure provides a voice recognition model training method and apparatus, an electronic device and a storage medium, relating to the field of artificial intelligence technology, and in particular to the fields such as deep learning and voice recognition. The specific implementation scheme includes constructing a negative sample according to a positive sample to obtain a target negative sample for constraining a voice decoding path; obtaining training data according to the positive sample and the target negative sample; and training a first voice recognition model according to the training data to obtain a second voice recognition model.

40.

Granted patent
Noise robust representations for keyword spotting systems (Active)

Publication No.: US12027156B2

Publication date: 2024-07-02

Application No.: US17677921

Filing date: 2022-02-22

Applicant: Cypress Semiconductor Corporation

Inventors: Aidan Smyth, Ashutosh Pandey, Avik Santra

IPC classification: G10L15/10, G10L15/04, G10L15/06, G10L15/08, G10L15/22, G10L25/18

CPC classification: G10L15/10, G10L15/04, G10L15/063, G10L15/22, G10L25/18, G10L2015/088, G10L2015/223

Abstract: Described are techniques for noise-robust and speaker-independent keyword spotting (KWS) in an input audio signal that contains keywords used to activate voice-based human-computer interactions. A KWS system may combine the latent representation generated by a denoising autoencoder (DAE) with audio features extracted from the audio signal using a machine learning approach. The DAE may be a discriminative DAE trained with a quadruplet loss metric learning approach to create a highly-separable latent representation of the audio signal in the audio input feature space. In one aspect, spectral characteristics of the audio signal such as Log-Mel features are combined with the latent representation generated by a quadruplet loss variational DAE (QVDAE) as input to a DNN KWS classifier. The KWS system improves keyword classification accuracy versus using extracted spectral features alone, non-discriminative DAE latent representations alone, or the extracted spectral features combined with the non-discriminative DAE latent representations in a KWS classifier.
