Patent search cpc:"G10L15/063", page 7

61.

Granted invention patent
On-device speech synthesis of textual segments for training of on-device speech recognition model (status: in force)

Publication number: US11978432B2

Publication date: 2024-05-07

Application number: US18204324

Application date: 2023-05-31

Applicant: GOOGLE LLC

Inventors: Françoise Beaufays, Johan Schalkwyk, Khe Chai Sim

IPC classification: G10L13/047, G10L15/06

CPC classification: G10L13/047, G10L15/063, G10L2015/0635

Abstract: Processor(s) of a client device can: identify a textual segment stored locally at the client device; process the textual segment, using a speech synthesis model stored locally at the client device, to generate synthesized speech audio data that includes synthesized speech of the identified textual segment; process the synthesized speech, using an on-device speech recognition model that is stored locally at the client device, to generate predicted output; and generate a gradient based on comparing the predicted output to ground truth output that corresponds to the textual segment. In some implementations, the generated gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model. In some implementations, the generated gradient is additionally or alternatively transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.

62.

Published invention application
LEARNING APPARATUS, ESTIMATION APPARATUS, METHODS AND PROGRAMS FOR THE SAME (status: under examination, published)

Publication number: US20240144912A1

Publication date: 2024-05-02

Application number: US18280159

Application date: 2021-03-10

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Junji WATANABE, Aiko MURATA

IPC classification: G10L15/04, G10L15/02, G10L15/06

CPC classification: G10L15/04, G10L15/02, G10L15/063

Abstract: An estimation apparatus includes an estimation unit that estimates a future incident occurrence quantitative value in a region on the basis of at least two or more inputted psychological-state/sensibility expressing words emitted in a predetermined region and the input order of the two or more psychological-state/sensibility expressing words, using an estimation model for estimating an incident occurrence quantitative value that is a quantitative value of an occurrence of a predetermined event in the region after a certain time, with an input being at least a time series of two or more psychological-state/sensibility expressing words emitted in the predetermined region before the certain time.

63.

Granted invention patent
Voice communication analysis system (status: in force)

Publication number: US11967307B2

Publication date: 2024-04-23

Application number: US17174845

Application date: 2021-02-12

Applicant: Oracle International Corporation

Inventors: Suraj Shinde

IPC classification: G10L15/16, G06N3/045, G06N20/00, G10L15/06, G10L15/22

CPC classification: G10L15/16, G06N3/045, G06N20/00, G10L15/063, G10L15/22, G10L2015/223

Abstract: Techniques are disclosed for applying a trained machine learning model to incoming voice communications to determine whether the voice communications are genuine or not genuine. The trained machine learning model may identify vocal attributes within the target call and, using the identified attributes and the training, determine whether the target call is genuine or not genuine. An applied trained machine learning model may include multiple different types of trained machine learning models, where each of the different types of machine learning models is trained and/or configured for a different function within the analysis.

64.

Published invention application
DOMAIN ADAPTIVE SPEECH RECOGNITION USING ARTIFICIAL INTELLIGENCE (status: under examination, published)

Publication number: US20240127801A1

Publication date: 2024-04-18

Application number: US17965226

Application date: 2022-10-13

Applicant: International Business Machines Corporation

Inventors: Tohru Nagano, Gakuto Kurata

IPC classification: G10L15/16, G10L15/02, G10L15/06, G10L15/30

CPC classification: G10L15/16, G10L15/02, G10L15/063, G10L15/30, G10L2015/022

Abstract: Methods, systems, and computer program products for domain adaptive speech recognition using artificial intelligence are provided herein. A computer-implemented method includes generating a set of language data candidates, each language data candidate comprising one or more graphemes, by processing a sequence of phonemes related to input speech data using an artificial intelligence-based data conversion model; determining, for a target pair of phonemes and graphemes, a subset of graphemes from the set of language data candidates; generating a first speech recognition output by processing the subset of graphemes using at least one biasing language model and an artificial intelligence-based speech recognition model; generating a second speech recognition output by replacing at least a portion of the subset of graphemes in the first speech recognition output with at least one of the graphemes from the target pair; and performing automated actions based on the second speech recognition output.

65.

Published invention application
LEARNING APPARATUS, ESTIMATION APPARATUS, METHODS AND PROGRAMS FOR THE SAME (status: under examination, published)

Publication number: US20240127796A1

Publication date: 2024-04-18

Application number: US18277552

Application date: 2021-02-18

Applicant: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Hiroshi SATO, Takaaki FUKUTOMI, Yusuke SHINOHARA

IPC classification: G10L15/06, G10L15/16

CPC classification: G10L15/063, G10L15/16, G10L2015/0635

Abstract: The present invention estimates the intention of an utterance more accurately than the related art. A learning device learns an estimation model on the basis of learning data including an acoustic signal for learning and a label indicating whether or not the acoustic signal has been uttered to a predetermined target. The learning device includes: a feature synchronization unit configured to obtain a post-synchronization feature by synchronizing an acoustic feature obtained from the acoustic signal for learning with a text feature corresponding to the acoustic signal; an utterance intention estimation unit configured to estimate whether or not the acoustic signal has been uttered to the predetermined target by using the post-synchronization feature; and a parameter update unit configured to update a parameter of the estimation model on the basis of the label included in the learning data and an estimation result by the utterance intention estimation unit.

66.

Granted invention patent
Training a user-system dialog in a task-oriented dialog system (status: in force)

Publication number: US11961509B2

Publication date: 2024-04-16

Application number: US16839308

Application date: 2020-04-03

Applicant: Microsoft Technology Licensing, LLC

Inventors: Swadheen Kumar Shukla, Lars Hasso Liden, Thomas Park, Matthew David Mazzola, Shahin Shayandeh, Jianfeng Gao, Eslam Kamal Abdelreheem

IPC classification: G10L15/00, G06N3/044, G06N3/049, G06N3/08, G10L15/06, G10L15/16, G10L15/22, G10L25/30

CPC classification: G10L15/063, G06N3/044, G06N3/049, G06N3/08, G10L15/16, G10L15/22, G10L25/30, G10L2015/0635, G10L2015/225

Abstract: Methods and systems are disclosed for improving dialog management for task-oriented dialog systems. The disclosed dialog builder leverages machine teaching processing to improve development of dialog managers. In this way, the dialog builder combines the strengths of both rule-based and machine-learned approaches to allow dialog authors to: (1) import a dialog graph developed using popular dialog composers, (2) convert the dialog graph to text-based training dialogs, (3) continuously improve the trained dialogs based on log dialogs, and (4) generate a corrected dialog for retraining the machine learning.

67.

Published invention application
MACHINE LEARNING METHOD FOR ENHANCING CARE OF A PATIENT USING VIDEO AND AUDIO ANALYTICS (status: under examination, published)

Publication number: US20240120108A1

Publication date: 2024-04-11

Application number: US18463685

Application date: 2023-09-08

Applicant: Insight Direct USA, Inc.

Inventors: Michael Griffin, Hailey Kotvis, Josephine Miner, Porter Moody, Kayla Poulsen, Austin Malmin, Sarah Onstad-Hawes, Gloria Solovey, Austin Streitmatter

IPC classification: G16H50/30, G06T7/00, G06V10/774, G06V20/40, G10L15/02, G10L15/06, G10L15/18, G10L25/66

CPC classification: G16H50/30, G06T7/0012, G06V10/774, G06V20/41, G06V20/46, G10L15/02, G10L15/063, G10L15/1815, G10L25/66, G16H10/60

Abstract: Apparatus and associated methods relate to enhancing care of a patient using video and audio analytics. Video data, audio data, and semantic text data are extracted from a video stream of the patient. The video data are analyzed to identify a first feature set. The audio data are analyzed to identify a second feature set. The semantic text data are analyzed to identify a third feature set. Using a computer-implemented machine-learning model, a health outcome of the patient is predicted based on the first, second, and/or third feature sets. The health outcome that is predicted is compared with the set of health outcomes of the training patients classified with the patient classification of the patient. Differences are identified between the feature sets corresponding to the patient and feature sets of the training patients who have better health outcomes than the patient's predicted health outcome. The differences identified are then reported.

68.

Granted invention patent
Phrase extraction for ASR models (status: in force)

Publication number: US11955134B2

Publication date: 2024-04-09

Application number: US17643848

Application date: 2021-12-13

Applicant: Google LLC

Inventors: Ehsan Amid, Om Thakkar, Rajiv Mathews, Francoise Beaufays

IPC classification: G10L21/0332, G10L15/06, G10L15/08, G10L21/10

CPC classification: G10L21/0332, G10L15/063, G10L15/08, G10L21/10

Abstract: A method of phrase extraction for ASR models includes obtaining audio data characterizing an utterance and a corresponding ground-truth transcription of the utterance and modifying the audio data to obfuscate a particular phrase recited in the utterance. The method also includes processing, using a trained ASR model, the modified audio data to generate a predicted transcription of the utterance, and determining whether the predicted transcription includes the particular phrase by comparing the predicted transcription of the utterance to the ground-truth transcription of the utterance. When the predicted transcription includes the particular phrase, the method includes generating an output indicating that the trained ASR model leaked the particular phrase from a training data set used to train the ASR model.

69.

Published invention application
IDENTIFYING AND CORRECTING AUTOMATIC SPEECH RECOGNITION (ASR) MISRECOGNITIONS IN A DECENTRALIZED MANNER (status: under examination, published)

Publication number: US20240112673A1

Publication date: 2024-04-04

Application number: US17958887

Application date: 2022-10-03

Applicant: GOOGLE LLC

Inventors: Rajiv Mathews, Rohit Prabhavalkar, Giovanni Motta, Mingqing Chen, Lillian Zhou, Dhruv Guliani, Harry Zhang, Trevor Strohman, Françoise Beaufays

IPC classification: G10L15/197, G10L15/06, G10L15/22, G10L15/30

CPC classification: G10L15/197, G10L15/063, G10L15/22, G10L15/30, G10L2015/0635

Abstract: Implementations described herein identify and correct automatic speech recognition (ASR) misrecognitions. For example, on-device processor(s) of a client device may generate a predicted textual segment that is predicted to correspond to a spoken utterance of a user of the client device, and may receive further input that modifies the predicted textual segment to an alternate textual segment. Further, the on-device processor(s) may store these textual segments in on-device storage as a candidate correction pair, and transmit the candidate correction pair to a remote system. Moreover, remote processor(s) of the remote system may determine that the candidate correction pair is an actual correction pair, and may cause client devices to generate updates for a global ASR model for the candidate correction pair. Additionally, the remote processor(s) may distribute the global ASR model to the client devices and/or additional client devices.

70.

Granted invention patent
Electronic device and control method therefor (status: in force)

Publication number: US11948567B2

Publication date: 2024-04-02

Application number: US17418314

Application date: 2019-10-04

Applicant: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jangho Jin, Jaehyun Bae

IPC classification: G10L15/22, G10L15/04, G10L15/06, G10L15/18

CPC classification: G10L15/22, G10L15/04, G10L15/063, G10L15/1822, G10L2015/223, G10L2015/227

Abstract: The present disclosure provides an electronic device and a control method therefor. The electronic device of the present disclosure comprises: a voice reception unit; and a processor for, when a first user voice and a second user voice are received through the voice reception unit, determining whether the second user voice corresponds to a candidate of utterance subsequent to the first user voice on the basis of a result obtained by dividing a plurality of attributes of the second user voice according to a predefined attribute, and controlling the electronic device to perform an operation corresponding to the second user voice on the basis of the intent of the second user voice obtained through a result of the determination.
