Patent search ap:("GOOGLE LLC") AND inv:"Petar Aleksic" Page 9

81.

Granted patent
Language model biasing system (Active)

Publication No.: US10311860B2

Publication date: 2019-06-04

Application No.: US15432620

Application date: 2017-02-14

Applicant: Google LLC

Inventor: Petar Aleksic , Pedro J. Moreno Mengibar

IPC: G10L15/00 , G10L15/07 , G10L15/187 , G10L15/18 , G10L15/197 , G10L15/30 , G10L15/01

Abstract: Methods, systems, and apparatus for receiving audio data corresponding to a user utterance and context data, identifying an initial set of one or more n-grams from the context data, generating an expanded set of one or more n-grams based on the initial set of n-grams, adjusting a language model based at least on the expanded set of n-grams, determining one or more speech recognition candidates for at least a portion of the user utterance using the adjusted language model, adjusting a score for a particular speech recognition candidate determined to be included in the expanded set of n-grams, determining a transcription of user utterance that includes at least one of the one or more speech recognition candidates, and providing the transcription of the user utterance for output.
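The expansion and rescoring steps in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `RELATED` lookup table, the fixed additive `boost`, and all function names are assumptions made for illustration, and the language model adjustment step is left out.

```python
def extract_ngrams(text, max_n=2):
    """Return all word n-grams (n = 1..max_n) found in the context text."""
    words = text.lower().split()
    return {
        " ".join(words[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(words) - n + 1)
    }


def expand_ngrams(initial, related):
    """Add n-grams that a (hypothetical) relatedness table links to the initial set."""
    expanded = set(initial)
    for ngram in initial:
        expanded.update(related.get(ngram, ()))
    return expanded


def rescore(candidates, expanded, boost=0.2):
    """Add a fixed boost to every candidate that appears in the expanded set."""
    return {
        text: score + (boost if text in expanded else 0.0)
        for text, score in candidates.items()
    }


# Hypothetical data: context text, relatedness table, recognizer candidates.
RELATED = {"italian": ["pizza", "pasta"]}
initial = extract_ngrams("Italian restaurants")
expanded = expand_ngrams(initial, RELATED)
candidates = {"pizza": 0.40, "pieces": 0.45}
rescored = rescore(candidates, expanded)
best = max(rescored, key=rescored.get)
print(best)
```

In this toy run the candidate "pizza" overtakes "pieces" only because the context n-gram "italian" pulled it into the expanded set.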

82.

Granted patent
Language model biasing modulation (Active)

Publication No.: US10297248B2

Publication date: 2019-05-21

Application No.: US15874075

Application date: 2018-01-18

Applicant: Google LLC

Inventor: Pedro J. Moreno Mengibar , Petar Aleksic

IPC: G10L21/00 , G10L15/07 , G10L15/197 , G10L15/183 , G10L15/24

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).
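One way to picture the modulation described here is to scale the biasing weights by the context confidence before applying them to the baseline model. The sketch below assumes a unigram model stored as a dictionary, a multiplicative `(1 + weight)` boost, and a hypothetical `BIAS_BY_CONTEXT` table; none of these details come from the abstract.

```python
def modulate(params, confidence):
    """Scale each biasing weight by the context confidence, clamped to [0, 1]."""
    c = max(0.0, min(1.0, confidence))
    return {ngram: weight * c for ngram, weight in params.items()}


def bias_language_model(baseline, params):
    """Multiply biased n-gram probabilities by (1 + weight), then renormalize."""
    scaled = {ng: p * (1.0 + params.get(ng, 0.0)) for ng, p in baseline.items()}
    total = sum(scaled.values())
    return {ng: p / total for ng, p in scaled.items()}


# Hypothetical biasing parameters selected for a likely "navigation" context.
BIAS_BY_CONTEXT = {"navigation": {"directions": 1.0, "traffic": 0.5}}
baseline = {"directions": 0.2, "traffic": 0.2, "weather": 0.6}

params = modulate(BIAS_BY_CONTEXT["navigation"], confidence=0.5)
biased = bias_language_model(baseline, params)
print(biased)
```

With a confidence of 0.0 the weights vanish and the baseline model is returned unchanged, which is the behavior one would want when the context guess is unreliable.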

83.

Granted patent
Language model biasing modulation (Active)

Publication No.: US12230251B2

Publication date: 2025-02-18

Application No.: US18064917

Application date: 2022-12-12

Applicant: Google LLC

Inventor: Pedro J. Moreno Mengibar , Petar Aleksic

IPC: G10L15/00 , G10L15/07 , G10L15/183 , G10L15/197 , G10L15/24

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).

84.

Patent application
CONTEXTUAL TAGGING AND BIASING OF GRAMMARS INSIDE WORD LATTICES (Active)

Publication No.: US20240428785A1

Publication date: 2024-12-26

Application No.: US18824716

Application date: 2024-09-04

Applicant: Google LLC

Inventor: Petar Aleksic , Pedro J. Moreno Mengibar , Leonid Velikovich

IPC: G10L15/197 , G10L15/16 , G10L15/18 , G10L15/187

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for implementing contextual grammar selection are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance. The actions include generating a word lattice that includes multiple candidate transcriptions of the utterance and that includes transcription confidence scores. The actions include determining a context of the computing device. The actions include based on the context of the computing device, identifying grammars that correspond to the multiple candidate transcriptions. The actions include determining, for each of the multiple candidate transcriptions, grammar confidence scores that reflect a likelihood that a respective grammar is a match for a respective candidate transcription. The actions include selecting, from among the candidate transcriptions, a candidate transcription. The actions further include providing, for output, the selected candidate transcription as a transcription of the utterance.
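The selection step can be sketched as a combination of transcription confidence and grammar confidence. In the Python example below the word lattice is flattened to a list of (text, confidence) pairs, grammars are regular expressions, grammar confidence is a binary full-match test, and the two scores are mixed linearly. All of these are simplifying assumptions, and `GRAMMARS_BY_CONTEXT` is a hypothetical table.

```python
import re

# Hypothetical grammars, keyed by device context.
GRAMMARS_BY_CONTEXT = {
    "alarm_app": {
        "set_alarm": re.compile(
            r"set an alarm for (one|two|three|four|five|six|seven|eight|nine|ten)"
        ),
    },
}


def grammar_confidence(pattern, text):
    """Binary stand-in for a grammar confidence score: 1.0 on a full match."""
    return 1.0 if pattern.fullmatch(text) else 0.0


def select_transcription(lattice_paths, context, weight=0.5):
    """Pick the path with the best mix of transcription and grammar confidence."""
    grammars = GRAMMARS_BY_CONTEXT.get(context, {})

    def combined(path):
        text, transcription_conf = path
        best_grammar = max(
            (grammar_confidence(p, text) for p in grammars.values()), default=0.0
        )
        return (1.0 - weight) * transcription_conf + weight * best_grammar

    return max(lattice_paths, key=combined)[0]


paths = [("set an alarm for sticks", 0.55), ("set an alarm for six", 0.45)]
print(select_transcription(paths, "alarm_app"))
print(select_transcription(paths, "unknown_context"))
```

When the context supplies a matching grammar, the acoustically weaker path wins; without any grammar for the context, the choice falls back to transcription confidence alone.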

85.

Granted patent
Determining dialog states for language models (Active)

Publication No.: US12080290B2

Publication date: 2024-09-03

Application No.: US17650567

Application date: 2022-02-10

Applicant: Google LLC

Inventor: Petar Aleksic , Pedro Jose Moreno Mengibar

IPC: G10L15/22 , G06F40/295 , G06F40/30 , G10L15/26 , G10L15/065 , G10L15/183 , G10L15/197

CPC classification number: G10L15/22 , G06F40/295 , G06F40/30 , G10L15/26 , G10L15/065 , G10L15/183 , G10L15/197

Abstract: Systems, methods, devices, and other techniques are described herein for determining dialog states that correspond to voice inputs and for biasing a language model based on the determined dialog states. In some implementations, a method includes receiving, at a computing system, audio data that indicates a voice input and determining a particular dialog state, from among a plurality of dialog states, which corresponds to the voice input. A set of n-grams can be identified that are associated with the particular dialog state that corresponds to the voice input. In response to identifying the set of n-grams that are associated with the particular dialog state that corresponds to the voice input, a language model can be biased by adjusting probability scores that the language model indicates for n-grams in the set of n-grams. The voice input can be transcribed using the adjusted language model.

86.

Published application
ALPHANUMERIC SEQUENCE BIASING FOR AUTOMATIC SPEECH RECOGNITION (Pending, published)

Publication No.: US20240233732A1

Publication date: 2024-07-11

Application No.: US18615621

Application date: 2024-03-25

Applicant: GOOGLE LLC

Inventor: Benjamin Haynor , Petar Aleksic

IPC: G10L15/26 , G10L15/16 , G10L15/193 , G10L15/22 , G10L15/30

CPC classification number: G10L15/26 , G10L15/16 , G10L15/193 , G10L15/22 , G10L15/30

Abstract: Speech processing techniques are disclosed that enable determining a text representation of alphanumeric sequences in captured audio data. Various implementations include determining a contextual biasing finite state transducer (FST) based on contextual information corresponding to the captured audio data. Additional or alternative implementations include modifying probabilities of one or more candidate recognitions of the alphanumeric sequence using the contextual biasing FST.
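As a rough illustration of the idea, the sketch below uses a small unweighted state machine in place of a contextual biasing FST. It accepts strings that fit an expected alphanumeric shape and adds a bonus to the log-score of accepted candidates. The pattern alphabet ("A" for letter, "D" for digit), the `bonus` value, and the function names are assumptions for illustration only.

```python
def accepts(pattern, text):
    """Walk a linear acceptor: state i consumes one character of class pattern[i]."""
    state = 0
    for ch in text:
        if state >= len(pattern):
            return False  # input is longer than the expected sequence
        kind = pattern[state]
        if (kind == "A" and ch.isalpha()) or (kind == "D" and ch.isdigit()):
            state += 1
        else:
            return False
    return state == len(pattern)  # accept only in the final state


def rescore(candidates, pattern, bonus=1.0):
    """Add a bonus to the log-score of each candidate the acceptor accepts."""
    return {
        text: score + (bonus if accepts(pattern, text) else 0.0)
        for text, score in candidates.items()
    }


# Hypothetical context: the user is expected to speak two letters then two digits.
candidates = {"ABI2": -1.0, "AB12": -1.2}
rescored = rescore(candidates, pattern="AADD")
best = max(rescored, key=rescored.get)
print(best)
```

A real FST would carry weights on its arcs and be composed with the recognizer's output; the acceptor here only captures the accept-and-boost behavior.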

87.

Granted patent
Voice to text conversion based on third-party agent content (Active)

Publication No.: US11922945B2

Publication date: 2024-03-05

Application No.: US18125606

Application date: 2023-03-23

Applicant: GOOGLE LLC

Inventor: Barnaby James , Bo Wang , Sunil Vemuri , David Schairer , Ulas Kirazci , Ertan Dogrultan , Petar Aleksic

IPC: G10L15/26 , G06F40/205 , G06F40/284 , G06F40/30 , G10L15/18 , G10L15/183 , G10L15/22 , G10L15/30

CPC classification number: G10L15/26 , G06F40/205 , G06F40/284 , G06F40/30 , G10L15/1815 , G10L15/183 , G10L15/22 , G10L15/30 , G10L2015/223 , G10L2015/228

Abstract: Implementations relate to dynamically, and in a context-sensitive manner, biasing voice to text conversion. In some implementations, the biasing of voice to text conversions is performed by a voice to text engine of a local agent, and the biasing is based at least in part on content provided to the local agent by a third-party (3P) agent that is in network communication with the local agent. In some of those implementations, the content includes contextual parameters that are provided by the 3P agent in combination with responsive content generated by the 3P agent during a dialog that: is between the 3P agent, and a user of a voice-enabled electronic device; and is facilitated by the local agent. The contextual parameters indicate potential feature(s) of further voice input that is to be provided in response to the responsive content generated by the 3P agent.

88.

Published application
SCALABLE DYNAMIC CLASS LANGUAGE MODELING (Pending, published)

Publication No.: US20240054998A1

Publication date: 2024-02-15

Application No.: US18486145

Application date: 2023-10-12

Applicant: Google LLC

Inventor: Justin Max Scheiner , Petar Aleksic

IPC: G10L15/197 , G06F16/683 , G06F16/33 , G06F40/289 , G10L15/22 , G10L15/30

CPC classification number: G10L15/197 , G06F16/683 , G06F16/3344 , G06F40/289 , G10L15/22 , G10L15/30 , G10L15/1815

Abstract: This document generally describes systems and methods for dynamically adapting speech recognition for individual voice queries of a user using class-based language models. The method may include receiving a voice query from a user that includes audio data corresponding to an utterance of the user, and context data associated with the user. One or more class models are then generated that collectively identify a first set of terms determined based on the context data, and a respective class to which the respective term is assigned for each respective term in the first set of terms. A language model that includes a residual unigram may then be accessed and processed for each respective class to insert a respective class symbol at each instance of the residual unigram that occurs within the language model. A transcription of the utterance of the user is then generated using the modified language model.
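The class-symbol insertion step can be sketched on a toy language model stored as `{history: {token: probability}}`. The abstract does not say how probability mass is handled, so the equal split between the residual unigram and the inserted class symbols below is purely an assumption, as are the `<residual>` token name and the `$CLASS` symbol format.

```python
RESIDUAL = "<residual>"  # hypothetical name for the residual unigram token


def insert_class_symbols(lm, class_models):
    """Wherever the residual unigram occurs, add one symbol per class.

    The residual's probability is shared equally between the residual itself
    and the new class symbols (an assumption made for this sketch).
    """
    modified = {}
    for history, dist in lm.items():
        new_dist = dict(dist)
        if RESIDUAL in new_dist and class_models:
            share = new_dist[RESIDUAL] / (len(class_models) + 1)
            new_dist[RESIDUAL] = share
            for class_name in class_models:
                new_dist[f"${class_name}"] = share
        modified[history] = new_dist
    return modified


# Hypothetical class model built from the user's context data (contact names).
class_models = {"CONTACT": {"alice", "bob"}}
lm = {
    "call": {"mom": 0.7, RESIDUAL: 0.3},
    "play": {"music": 1.0},
}
modified = insert_class_symbols(lm, class_models)
print(modified["call"])
```

At decoding time the `$CONTACT` symbol would be expanded with the terms in the class model, so the static language model never needs to be retrained for each user.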

89.

Granted patent
Word lattice augmentation for automatic speech recognition (Active)

Publication No.: US11797772B2

Publication date: 2023-10-24

Application No.: US17589186

Application date: 2022-01-31

Applicant: GOOGLE LLC

Inventor: Leonid Velikovich , Petar Aleksic , Pedro Moreno

IPC: G10L15/22 , G10L15/187 , G06F40/295 , G06F40/30 , G10L15/06

CPC classification number: G06F40/295 , G06F40/30 , G10L15/063 , G10L15/187 , G10L15/22

Abstract: Speech processing techniques are disclosed that enable determining a text representation of named entities in captured audio data. Various implementations include determining the location of a carrier phrase in a word lattice representation of the captured audio data, where the carrier phrase provides an indication of a named entity. Additional or alternative implementations include matching a candidate named entity with the portion of the word lattice, and augmenting the word lattice with the matched candidate named entity.

90.

Published application
LANGUAGE MODEL BIASING SYSTEM (Pending, published)

Publication No.: US20230290339A1

Publication date: 2023-09-14

Application No.: US18318495

Application date: 2023-05-16

Applicant: Google LLC

Inventor: Petar Aleksic , Pedro J. Moreno Mengibar

IPC: G10L15/07 , G10L15/187 , G10L15/18 , G10L15/197

CPC classification number: G10L15/07 , G10L15/187 , G10L15/1815 , G10L15/197 , G10L15/30

Abstract: Methods, systems, and apparatus for receiving audio data corresponding to a user utterance and context data, identifying an initial set of one or more n-grams from the context data, generating an expanded set of one or more n-grams based on the initial set of n-grams, adjusting a language model based at least on the expanded set of n-grams, determining one or more speech recognition candidates for at least a portion of the user utterance using the adjusted language model, adjusting a score for a particular speech recognition candidate determined to be included in the expanded set of n-grams, determining a transcription of user utterance that includes at least one of the one or more speech recognition candidates, and providing the transcription of the user utterance for output.
