Patent search ipc:"G10L15/19" Page 1

1.

Granted invention patent
Systems and methods for formatting informal utterances (Status: In force)

Publication No.: US12198685B2

Publication date: 2025-01-14

Application No.: US18184432

Filing date: 2023-03-15

Applicant: PAYPAL, INC.

Inventor: Sandro Cavallari , Yuzhen Zhuo , Van Hoang Nguyen , Quan Jin Ferdinand Tang , Gautam Vasappanavara

IPC: G06F40/00 , G06F40/205 , G06F40/253 , G06F40/284 , G10L15/19 , G10L15/26

Abstract: Methods and systems are presented for translating informal utterances into formal texts. Informal utterances may include words in abbreviated forms or typographical errors. The informal utterances may be processed by mapping each word in an utterance into a well-defined token. The mapping from the words to the tokens may be based on a context associated with the utterance, derived by analyzing the utterance on a character-by-character basis. The token that is mapped for each word can be one of a vocabulary token that corresponds to a formal word in a pre-defined word corpus, an unknown token that corresponds to an unknown word, or a masked token. Formal text may then be generated based on the mapped tokens. Through the processing of informal utterances using the techniques disclosed herein, the informal utterances are both normalized and sanitized.
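
Illustrative sketch (not taken from the patent): the three token types named in the abstract can be shown with a plain lookup. The patent derives the mapping from character-level context; this sketch swaps in a fixed abbreviation table and a digit-run rule instead. The vocabulary, the abbreviation table, the masking rule, and the names `map_word` and `formalize` are all assumptions made for illustration.

```python
import re

# Hypothetical formal vocabulary and abbreviation table (assumptions).
VOCAB = {"please", "send", "money", "to", "my", "account", "thanks"}
ABBREVIATIONS = {"pls": "please", "thx": "thanks", "acct": "account"}

# Assumed sanitization rule: runs of six or more digits are masked.
SENSITIVE = re.compile(r"^\d{6,}$")

UNK, MASK = "[UNK]", "[MASK]"


def map_word(word: str) -> str:
    """Map one informal word to a vocabulary, unknown, or masked token."""
    w = word.lower()
    if SENSITIVE.match(w):
        return MASK
    w = ABBREVIATIONS.get(w, w)  # expand known abbreviations
    return w if w in VOCAB else UNK


def formalize(utterance: str) -> str:
    """Generate formal text from the tokens mapped for each word."""
    return " ".join(map_word(w) for w in utterance.split())
```

Under these assumptions, `formalize("pls send money to my acct 12345678")` expands the two abbreviations and masks the trailing number.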

2.

Invention application
SYSTEMS AND METHODS FOR TRAINING VOICE QUERY MODELS (Status: In force)

Publication No.: US20240428779A1

Publication date: 2024-12-26

Application No.: US18823383

Filing date: 2024-09-03

Applicant: Comcast Cable Communications, LLC

Inventor: WENYAN LI , FERHAN TURE , JOSE CASILLAS , GEORGE THOMAS DES JARDINS

IPC: G10L15/06 , G06F40/169 , G06N20/00 , G10L15/19 , G10L15/22 , G10L25/63

Abstract: Methods for automatically evaluating ASR outputs and providing annotations on the transcriptions, including corrections, in order to improve recognition may be based on an analysis of sessions of user voice queries, utilizing time-ordered ASR transcriptions of user voice queries (i.e., user utterances). This utterance-based approach may involve extracting both session-level and query-level characteristics from voice query sessions and identifying patterns of query reformulation in order to detect erroneous transcriptions and automatically determine an appropriate correction. Alternatively, or in addition, ASR outputs may be evaluated based on user behavior. The outcomes may be classified as positive or negative. An ASR transcription may be labeled using the description of the outcome. The labeled transcription may be used as training data to train a model to output improved transcriptions of voice queries.

3.

Invention application
METHOD, APPARATUS, AND COMPUTER-READABLE RECORDING MEDIUM FOR CONTROLLING RESPONSE UTTERANCE BEING REPRODUCED AND PREDICTING USER INTENTION (Status: In force)

Publication No.: US20240379097A1

Publication date: 2024-11-14

Application No.: US18464240

Filing date: 2023-09-10

Applicant: Saltlux Inc.

Inventor: Kyung Il LEE , Jong Won LEE

IPC: G10L15/18 , G10L15/02 , G10L15/04 , G10L15/19 , G10L15/22

Abstract: A method for controlling a response utterance being reproduced and predicting a user intention includes: a voice signal analysis step of, when a second voice signal is received from a user while a first response utterance for responding to a first voice signal is output, starting analysis on the second voice signal; an utterance control step of, when one of preset keywords is identified to be included in a first sentence corresponding to the second voice signal, controlling the first response utterance being output to correspond to the identified keyword; an intention prediction step of, when a third voice signal is received in a state where the first response utterance is controlled, analyzing the third voice signal to predict an intention of the user, which is reflected in a second sentence corresponding to the third voice signal; and a response utterance output step of outputting a response utterance.

4.

Invention application
AUTOMATIC LEARNING OF ENTITIES, WORDS, PRONUNCIATIONS, AND PARTS OF SPEECH (Status: In force)

Publication No.: US20240379092A1

Publication date: 2024-11-14

Application No.: US18783423

Filing date: 2024-07-25

Applicant: SoundHound AI IP, LLC.

Inventor: Anton V. RELIN

IPC: G10L15/02 , G10L15/14 , G10L15/19

Abstract: Systems for automatic speech recognition and/or natural language understanding automatically learn new words by finding subsequences of phonemes that, if they were a new word, would enable a successful tokenization of a phoneme sequence. Systems can learn alternate pronunciations of words by finding phoneme sequences with a small edit distance to existing pronunciations. Systems can learn the part of speech of words by finding part-of-speech variations that would enable parses by syntactic grammars. Systems can learn what types of entities a word describes by finding sentences that could be parsed by a semantic grammar but for the words not being on an entity list.
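
Illustrative sketch (not taken from the patent): the idea of "phoneme sequences with a small edit distance to existing pronunciations" can be shown with a standard Levenshtein distance over phoneme lists. The tiny lexicon, the ARPAbet-style symbols, the threshold `max_distance`, and the function names are assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def alternate_pronunciation_candidates(observed, lexicon, max_distance=1):
    """Return words whose known pronunciation is close to, but not the
    same as, the observed phoneme sequence."""
    return sorted(
        word for word, phones in lexicon.items()
        if 0 < edit_distance(observed, phones) <= max_distance
    )


# Hypothetical lexicon with ARPAbet-style symbols (assumption).
LEXICON = {
    "tomato": ["T", "AH", "M", "EY", "T", "OW"],
    "potato": ["P", "AH", "T", "EY", "T", "OW"],
}
```

With this lexicon, an observed sequence that differs from "tomato" in a single vowel is returned as a candidate alternate pronunciation of "tomato" only.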

5.

Granted invention patent
Methods and systems for automatic call data generation (Status: In force)

Publication No.: US12014144B2

Publication date: 2024-06-18

Application No.: US17390573

Filing date: 2021-07-30

Applicant: INTUIT INC.

Inventor: Zhewen Fan , Byungkyu Kang , Wan Yu Zhang , Carlos A. Oliveira , Wenxin Xiao

IPC: G06F40/30 , G06F16/38 , G06F18/22 , G06F40/279 , G10L15/19 , G10L15/22 , H04M3/51

CPC classification number: G06F40/30 , G06F16/38 , G06F18/22 , G06F40/279 , G10L15/19 , G10L15/22 , H04M3/51

Abstract: A processor may receive a call transcript including text and form a text string including at least a portion of the text. The processor may generate a situation description of the call transcript, which may comprise processing the text string using a transformer-based machine learning model. The processor may generate a trouble description of the call transcript, which may comprise creating a sentence embedding of the situation description, creating sentence embeddings for a plurality of utterances within the portion of the text, determining respective similarities between the sentence embedding of the situation description and each of the sentence embeddings for each respective one of the plurality of utterances, and selecting at least one of the plurality of utterances having at least one highest determined respective similarity as the trouble description. The processor may store a call summary comprising the situation description and the trouble description in a non-transitory memory.
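
Illustrative sketch (not taken from the patent): the selection step in the abstract, choosing the utterance whose embedding is most similar to the embedding of the situation description, can be shown with cosine similarity. The patent refers to sentence embeddings; this sketch swaps in bag-of-words count vectors so that it runs without a model. The functions `embed`, `cosine`, and `select_trouble_description` are hypothetical names.

```python
import math
from collections import Counter


def embed(sentence):
    """Stand-in embedding (assumption): a bag-of-words count vector."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_trouble_description(situation, utterances):
    """Pick the utterance most similar to the situation description."""
    target = embed(situation)
    return max(utterances, key=lambda u: cosine(target, embed(u)))
```

A real system would replace `embed` with a sentence-embedding model; the similarity ranking and the selection of the top utterance stay the same.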

6.

Invention publication
TRAINING SPEECH RECOGNITION SYSTEMS USING WORD SEQUENCES (Status: Under examination, published)

Publication No.: US20240127798A1

Publication date: 2024-04-18

Application No.: US18538957

Filing date: 2023-12-13

Applicant: Sorenson IP Holdings, LLC

Inventor: David Thomson

IPC: G10L15/065 , G06F21/60 , G10L15/06 , G10L15/19 , G10L25/51 , H04L9/08

CPC classification number: G10L15/065 , G06F21/602 , G10L15/063 , G10L15/19 , G10L25/51 , H04L9/0869

Abstract: A method may include obtaining a text string that is a transcription of audio data and selecting a sequence of words from the text string as a first word sequence. The method may further include encrypting the first word sequence and comparing the encrypted first word sequence to multiple encrypted word sequences. Each of the multiple encrypted word sequences may be associated with a corresponding one of multiple counters. The method may also include in response to the encrypted first word sequence corresponding to one of the multiple encrypted word sequences based on the comparison, incrementing a counter of the multiple counters associated with the one of the multiple encrypted word sequences and adapting a language model of an automatic transcription system using the multiple encrypted word sequences and the multiple counters.
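
Illustrative sketch (not taken from the patent): counting word sequences without storing them in the clear can be shown with a deterministic keyed digest. The abstract speaks of encrypting word sequences; this sketch swaps in HMAC-SHA256, which is a keyed hash rather than reversible encryption, so that equal sequences compare equal. Adding a new counter for a previously unseen sequence, the n-gram length `n`, and the names `protect` and `count_sequences` are assumptions.

```python
import hashlib
import hmac


def protect(words, key):
    """Deterministic keyed digest of a word sequence (stand-in for encryption)."""
    message = " ".join(words).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def count_sequences(transcript, key, n, counters):
    """Increment the counter of every protected n-word sequence in a transcript.

    `counters` maps digest -> count; only digests are stored, never the words.
    """
    words = transcript.lower().split()
    for i in range(len(words) - n + 1):
        digest = protect(words[i:i + n], key)
        # Assumption: unseen sequences start a new counter at zero.
        counters[digest] = counters.get(digest, 0) + 1
    return counters
```

The resulting digest counts could then serve as n-gram statistics for adapting a language model, provided the adapting component holds the same key.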

7.

Invention publication
MINUTES OF MEETING PROCESSING METHOD AND APPARATUS, DEVICE, AND MEDIUM (Status: Under examination, published)

Publication No.: US20240079002A1

Publication date: 2024-03-07

Application No.: US18262400

Filing date: 2022-01-05

Applicant: BEIJING ZITIAO NETWORK TECHNOLOGY CO., LTD.

Inventor: Chunsai DU , Jingsheng YANG , Kojung CHEN , Xiang ZHENG , Wenming XU

IPC: G10L15/19 , G06F3/16 , G06Q10/10 , G10L15/04 , G10L15/06 , G10L15/22

CPC classification number: G10L15/19 , G06F3/165 , G06Q10/10 , G10L15/04 , G10L15/063 , G10L15/22

Abstract: A minutes of meeting processing method, a device, and a medium. The method comprises: acquiring meeting text of a meeting audio/video; inputting the meeting text into a to-do identification model, and determining initial to-do statements; inputting the initial to-do statements into a tense determination model, and determining tense results of the initial to-do statements; and determining a meeting to-do statement in the initial to-do statements on the basis of the tense results.

8.

Granted invention patent
Multi-session context (Status: In force)

Publication No.: US11908463B1

Publication date: 2024-02-20

Application No.: US17361761

Filing date: 2021-06-29

Applicant: Amazon Technologies, Inc.

Inventor: Arjit Biswas , Shishir Bharathi , Anushree Venkatesh , Yun Lei , Ashish Kumar Agrawal , Siddhartha Reddy Jonnalagadda , Prakash Krishnan , Arindam Mandal , Raefer Christopher Gabriel , Abhay Kumar Jha , David Chi-Wai Tang , Savas Parastatidis

IPC: G10L15/22 , G06F40/35 , G10L15/183 , G10L15/18 , G06F40/279 , G06F40/295 , G10L15/19 , G06F40/30

CPC classification number: G10L15/183 , G06F40/279 , G10L15/1815 , G10L15/22 , G06F40/295 , G06F40/30 , G06F40/35 , G10L15/1822 , G10L15/19 , G10L2015/228

Abstract: Techniques for storing and using multi-session context are described. A system may store context data corresponding to a first interaction, where the context data may include action data, entity data and a profile identifier for a user. Later the stored context data may be retrieved during a second interaction corresponding to the entity of the second interaction. The second interaction may take place at a system different than the first interaction. The system may generate a response during the second interaction using the stored context data of the prior interaction.

9.

Granted invention patent
Systems and methods for scripted audio production (Status: In force)

Publication No.: US11875797B2

Publication date: 2024-01-16

Application No.: US17355023

Filing date: 2021-06-22

Applicant: Pozotron Inc.

Inventor: Jakub Poznanski , Kostiantyn Hlushak

IPC: G10L15/26 , G10L15/08 , G10L15/19 , G10L15/187 , G06F3/04842 , G10L15/06

CPC classification number: G10L15/26 , G06F3/04842 , G10L15/063 , G10L15/083 , G10L15/187 , G10L15/19

Abstract: A scripted audio production system in which the scripted audio production computerized process decreases production time by improving computerized processes and technological systems for pronunciation research and script preparation, narration, editing, proofing and mastering. The system enables the user to upload their manuscript and recorded audio of the narration of the manuscript to the system. The system then compares the recorded audio against the previously uploaded manuscript, and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on the recorded audio by the narrator first uploading the audio and manuscript to the scripted audio production technology system.

10.

Invention publication
STREAMING PUNCTUATION FOR LONG-FORM DICTATION (Status: Under examination, published)

Publication No.: US20230352009A1

Publication date: 2023-11-02

Application No.: US17732971

Filing date: 2022-04-29

Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventor: Piyush BEHRE , Sharman W TAN , Shuangyu CHANG , Padma VARADHARAJAN , Sayan Dev PATHAK , Ravikant GUPTA

IPC: G10L15/19 , G10L15/04 , G10L15/22 , G06F40/58 , G06F40/103

CPC classification number: G10L15/19 , G10L15/04 , G10L15/22 , G06F40/58 , G06F40/103

Abstract: Systems generate segments of spoken language utterances based on different sets of segmentation boundaries. The systems are also configured to generate one or more formatted segments by assigning punctuation tags at segmentation boundaries and to generate one or more final sentences from the one or more segments.
