Patent search: cpc:"G10L15/063", page 5

41.

Granted invention patent
Emitting word timings with end-to-end models (Status: Active)

Publication No.: US12027154B2

Publication date: 2024-07-02

Application No.: US18167050

Filing date: 2023-02-09

Applicant: Google LLC

Inventors: Tara N. Sainath, Basilio Garcia Castillo, David Rybach, Trevor Strohman, Ruoming Pang

IPC classification: G10L25/30, G10L15/06, G10L25/78

CPC classification: G10L15/063, G10L25/30, G10L25/78

Abstract: A method includes receiving a training example that includes audio data representing a spoken utterance and a ground truth transcription. For each word in the spoken utterance, the method also includes inserting a placeholder symbol before the respective word identifying a respective ground truth alignment for a beginning and an end of the respective word, determining a beginning word piece and an ending word piece, and generating a first constrained alignment for the beginning word piece and a second constrained alignment for the ending word piece. The first constrained alignment is aligned with the ground truth alignment for the beginning of the respective word and the second constrained alignment is aligned with the ground truth alignment for the ending of the respective word. The method also includes constraining an attention head of a second pass decoder by applying the first and second constrained alignments.

42.

Granted invention patent
Method to improve digital agent conversations (Status: Active)

Publication No.: US12020689B2

Publication date: 2024-06-25

Application No.: US17718357

Filing date: 2022-04-12

Applicant: International Business Machines Corporation

Inventors: Mukundan Sundararajan, Jignesh K Karia, Sandipan Sarkar, Deepa Dubey

IPC classification: G10L15/22, G10L15/06, G10L15/28, G10L15/30

CPC classification: G10L15/063, G10L15/22, G10L15/285, G10L15/30, G10L2015/0633

Abstract: A computer-implemented method for virtual agent conversation training is disclosed. The computer-implemented method includes determining a current state of a first stage of a conversation between a pair of virtual agents. The computer-implemented method further includes determining a pivot distance between the current state of the first stage of the conversation and a subsequent, second stage of the conversation. The computer-implemented method further includes responsive to determining that the pivot distance between the current state of the first stage of the conversation and the subsequent, second stage of the conversation is below a predetermined threshold, determining an angle of dislocation with respect to the pivot distance. The computer-implemented method further includes terminating the conversation based, at least in part, on determining that the angle of dislocation is above a predetermined threshold.

43.

Published patent application
LEXICON LEARNING-BASED HELIUMSPEECH UNSCRAMBLING METHOD IN SATURATION DIVING (Status: Pending, published)

Publication No.: US20240203436A1

Publication date: 2024-06-20

Application No.: US18427869

Filing date: 2024-01-31

Applicant: Nantong University

Inventors: Shibing ZHANG, Jianrong WU, Lili GUO, Ming LI, Zhihua BAO

IPC classification: G10L21/0208, G10L15/06, G10L15/16, G10L25/51

CPC classification: G10L21/0208, G10L15/063, G10L15/16, G10L25/51, G10L2015/0633

Abstract: The present application relates to a lexicon learning-based heliumspeech unscrambling method in saturation diving. In a system including divers, a correction network, and an unscrambling network, a common working language lexicon for saturation diving operation is established and is read by the divers respectively in different environments, to generate supervision signals and vector signals of the correction network, and the correction network learns heliumspeeches of the different divers at different diving depths to obtain a correction network parameter, and corrects a heliumspeech of a diver to obtain a corrected speech; and the unscrambling network learns the corrected speech and completes unscrambling of the heliumspeech.

44.

Published patent application
SYSTEMS AND METHODS FOR IMPROVED AUTOMATIC SPEECH RECOGNITION SYSTEMS (Status: Pending, published)

Publication No.: US20240203397A1

Publication date: 2024-06-20

Application No.: US18066174

Filing date: 2022-12-14

Applicant: Comcast Cable Communications, LLC

Inventors: Raphael TANG, Karun KUMAR, Kendra CHALKLEY, Liming ZHANG, Wenyan LI, Pamela SHAPIRO, Yajie MAO, Gefei YANG, Jun Ho SHIN, Geoffrey Craig MURRAY

IPC classification: G10L15/01, G06F40/169, G10L15/06, G10L15/197, G10L15/22

CPC classification: G10L15/01, G06F40/169, G10L15/063, G10L15/197, G10L15/22

Abstract: Selection of training utterances may be carried out in a sample-efficient manner, and the selected training utterances may be annotated to provide improved training information to an ASR system. A computing device may receive, from an ASR system, one or more transcript-score pairs, wherein a transcript-score pair comprises a transcription associated with a voice query and at least one score associated with the transcription. The computing device may determine a likelihood of a word error associated with each transcription of the one or more transcript-score pairs. The computing device may determine, based on the likelihood of the word error, an effect on a word-error rate of the ASR system. The computing device may send at least one of the one or more transcript-score pairs with a threshold effect on the word-error rate of the ASR system to be annotated.
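
The selection step in this abstract could be illustrated roughly as follows. This is only a sketch under stated assumptions, not the method claimed in the application: the abstract does not say how the error likelihood or the word-error-rate effect is computed, so the code assumes the score is a confidence in [0, 1], takes `1 - score` as the error likelihood, and weights it by the transcript's share of all words. The function name `select_for_annotation` and the parameter `effect_threshold` are hypothetical.

```python
def select_for_annotation(pairs, effect_threshold):
    """Pick transcript-score pairs whose estimated WER effect meets a threshold.

    pairs: list of (transcript, score), with score assumed to be a
    confidence in [0, 1]. Both formulas below are illustrative assumptions.
    """
    total_words = sum(len(transcript.split()) for transcript, _ in pairs)
    if total_words == 0:
        return []
    selected = []
    for transcript, score in pairs:
        # Assumed proxy: low confidence means a word error is likely.
        error_likelihood = 1.0 - score
        # Assumed proxy: longer, less certain transcripts move WER more.
        effect = error_likelihood * len(transcript.split()) / total_words
        if effect >= effect_threshold:
            selected.append((transcript, score))
    return selected


pairs = [("play the news", 0.9), ("pay the noose tonight", 0.2)]
to_annotate = select_for_annotation(pairs, effect_threshold=0.1)
```

Under these assumptions only the low-confidence transcript is sent for annotation, which is the sample-efficient behavior the abstract describes in general terms.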

45.

Granted invention patent
Large-scale language model data selection for rare-word speech recognition (Status: Active)

Publication No.: US12014725B2

Publication date: 2024-06-18

Application No.: US17643861

Filing date: 2021-12-13

Applicant: Google LLC

Inventors: Ronny Huang, Tara N. Sainath

IPC classification: G10L15/16, G06N3/02, G10L15/06, G10L15/197, G10L15/22

CPC classification: G10L15/063, G06N3/02, G10L15/16, G10L15/197, G10L15/22

Abstract: A method of training a language model for rare-word speech recognition includes obtaining a set of training text samples, and obtaining a set of training utterances used for training a speech recognition model. Each training utterance in the plurality of training utterances includes audio data corresponding to an utterance and a corresponding transcription of the utterance. The method also includes applying rare word filtering on the set of training text samples to identify a subset of rare-word training text samples that include words that do not appear in the transcriptions from the set of training utterances or appear in the transcriptions from the set of training utterances less than a threshold number of times. The method further includes training the external language model on the transcriptions from the set of training utterances and the identified subset of rare-word training text samples.
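
The rare-word filtering step described in this abstract can be sketched as below. This is a minimal illustration, not the patented implementation: lowercasing and whitespace tokenization are assumptions, and the function name `rare_word_filter` is hypothetical. A text sample is kept when at least one of its words is absent from the transcriptions or appears in them fewer than `threshold` times.

```python
from collections import Counter


def rare_word_filter(text_samples, transcriptions, threshold):
    """Return the text samples that contain at least one rare word.

    A word is rare if it appears in the transcriptions fewer than
    `threshold` times (unseen words have a count of zero).
    Tokenization by lowercasing and splitting on whitespace is an assumption.
    """
    counts = Counter(
        word for transcript in transcriptions for word in transcript.lower().split()
    )
    return [
        sample
        for sample in text_samples
        if any(counts[word] < threshold for word in sample.lower().split())
    ]


transcriptions = ["play some music", "play the news", "call mom"]
text_samples = ["play music", "play zydeco music", "call the pharmacy"]

# threshold=1 keeps only samples with a word never seen in the transcriptions.
rare_subset = rare_word_filter(text_samples, transcriptions, threshold=1)
```

The language model would then be trained on the transcriptions together with `rare_subset`, as the abstract states.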

46.

Granted invention patent
System and method for data augmentation of feature-based voice data (Status: Active)

Publication No.: US12014722B2

Publication date: 2024-06-18

Application No.: US17197587

Filing date: 2021-03-10

Applicant: Microsoft Technology Licensing, LLC

Inventors: Dushyant Sharma, Patrick A. Naylor, James W. Fosburgh

IPC classification: G10L13/02, G06F3/16, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/06, G10L15/065, G10L21/0224, G10L25/03, H04S7/00

CPC classification: G10L13/02, G06F3/165, G06N5/02, G06N20/00, G10K15/08, G10L13/033, G10L15/02, G10L15/063, G10L15/065, G10L21/0224, G10L25/03, H04S7/30, H04S7/302, H04S7/303

Abstract: A method, computer program product, and computing system for receiving feature-based voice data associated with a first acoustic domain. One or more gain-based augmentations may be performed on at least a portion of the feature-based voice data, thus defining gain-augmented feature-based voice data.
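
One way to picture a gain-based augmentation in the feature domain is sketched below. The abstract does not specify the feature type or how the gain is applied, so this assumes natural-log power features (for example, log-mel filterbank energies), where scaling the signal power by a gain in decibels becomes a constant additive offset. The function name `apply_gain_db` is hypothetical.

```python
import math


def apply_gain_db(log_power_frames, gain_db):
    """Apply a gain (in dB) to natural-log power features.

    Scaling power by 10 ** (gain_db / 10) adds gain_db * ln(10) / 10
    to each natural-log power value. The feature type is an assumption.
    """
    offset = gain_db * math.log(10.0) / 10.0
    return [[value + offset for value in frame] for frame in log_power_frames]


# Two frames with three feature bins each (illustrative values only).
frames = [[0.0, 1.0, 2.0], [0.5, 1.5, 2.5]]
louder = apply_gain_db(frames, gain_db=10.0)
quieter = apply_gain_db(frames, gain_db=-6.0)
```

Because the offset is applied to features rather than waveforms, the augmentation needs no resynthesis of audio, which fits the abstract's focus on feature-based voice data.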

47.

Granted invention patent
Systems and methods for grapheme-phoneme correspondence learning (Status: Active)

Publication No.: US12008921B2

Publication date: 2024-06-11

Application No.: US18152625

Filing date: 2023-01-10

Applicant: 617 Education Inc.

Inventors: Tom Dillon

IPC classification: G09B7/04, G06F3/16, G09B19/04, G10L15/02, G10L15/06, G10L15/22, G10L25/18, G10L25/30

CPC classification: G09B7/04, G06F3/167, G09B19/04, G10L15/02, G10L15/063, G10L15/22, G10L25/18, G10L25/30, G10L2015/025, G10L2015/225

Abstract: Systems and methods are described for grapheme-phoneme correspondence learning. In an example, a display of a device is caused to output a grapheme graphical user interface (GUI) that includes a grapheme. Audio data representative of a sound made by the human user is received based on the grapheme shown on the display. A grapheme-phoneme model can determine whether the sound made by the human corresponds to a phoneme for the displayed grapheme based on the audio data. The grapheme-phoneme model is trained based on augmented spectrogram data. A speaker is caused to output a sound representative of the phoneme for the grapheme to provide the human with a correct pronunciation of the grapheme in response to the grapheme-phoneme model determining that the sound made by the human does not correspond to the phoneme for the grapheme.

48.

Published patent application
SYSTEM AND METHOD FOR KEYWORD FALSE ALARM REDUCTION (Status: Pending, published)

Publication No.: US20240185850A1

Publication date: 2024-06-06

Application No.: US18352601

Filing date: 2023-07-14

Applicant: Samsung Electronics Co., Ltd.

Inventors: Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Ching-Hua Lee, Chou-Chang Yang, Yilin Shen, Hongxia Jin

IPC classification: G10L15/22, G10L15/02, G10L15/06, G10L15/18, G10L25/78

CPC classification: G10L15/22, G10L15/02, G10L15/063, G10L15/18, G10L25/78, G10L2015/088, G10L2015/223

Abstract: A method includes extracting, using a keyword detection model, audio features from audio data. The method also includes processing the audio features by a first layer of the keyword detection model configured to predict a first likelihood that the audio data includes speech. The method also includes processing the audio features by a second layer of the keyword detection model configured to predict a second likelihood that the audio data includes keyword-like speech. The method also includes processing the audio features by a third layer of the keyword detection model configured to predict a third likelihood, for each of a plurality of possible keywords, that the audio data includes the keyword. The method also includes identifying a keyword included in the audio data. The method also includes generating instructions to perform an action based at least in part on the identified keyword.

49.

Published patent application
Modular Training for Flexible Attention Based End-to-End ASR (Status: Pending, published)

Publication No.: US20240185839A1

Publication date: 2024-06-06

Application No.: US18526148

Filing date: 2023-12-01

Applicant: Google LLC

Inventors: Kartik AUDHKHASI, Bhuvana Ramabhadran, Brian Farris

IPC classification: G10L15/06

CPC classification: G10L15/063, G10L2015/0635

Abstract: A method for training a modular neural network model includes training only a backbone model to provide a first model configuration of the modular neural network model. The first model configuration includes only the trained backbone model. The method also includes adding an intrinsic sub-model to the trained backbone model. During a fine-tuning training stage, the method includes freezing parameters of the trained backbone model and fine-tuning parameters of the intrinsic sub-model added to the trained backbone model while the parameters of the trained backbone model are frozen to provide a second model configuration that includes the backbone model initially trained during the initial training stage and the intrinsic sub-model having the parameters fine-tuned during the fine-tuning stage.

50.

Published patent application
SYSTEM AND METHOD FOR ACTIVE LEARNING BASED MULTILINGUAL SEMANTIC PARSER (Status: Pending, published)

Publication No.: US20240185838A1

Publication date: 2024-06-06

Application No.: US18318225

Filing date: 2023-05-16

Applicant: Openstream Inc.

Inventors: Zhuang Li, Ghlolamreza Haffari, Rajasekhar Tumuluri, Philp R. Cohen

IPC classification: G10L15/06, G10L15/18

CPC classification: G10L15/063, G10L15/1822, G10L2015/0635

Abstract: Described is a system and method for training a multilingual semantic parser. A method includes receiving, by a multilingual semantic parser, a multilingual training dataset, wherein the multilingual training dataset includes pairs of utterances and meaning representations from at least one high-resource language and at least one low-resource language and wherein the multilingual training dataset is initially a machine-translated dataset, training, the multilingual semantic parser, by translating the utterances in the multilingual training dataset to a target language; and iteratively performing selecting, by an acquisition functions estimator, a subset of the multilingual training dataset for human translation, updating the multilingual training dataset with the human-translated subset of the multilingual training dataset with, and retraining, the multilingual semantic parser, with the updated multilingual training dataset.
