Patent search ap:("GOOGLE LLC") AND inv:"Petar Aleksic" Page 10

91.

Granted invention patent
Server side hotwording (Active)

Publication (announcement) number: US11699443B2

Publication (announcement) date: 2023-07-11

Application number: US17337182

Application date: 2021-06-02

Applicant: GOOGLE LLC

Inventor: Alexander H. Gruenstein , Petar Aleksic , Johan Schalkwyk , Pedro J. Moreno Mengibar

IPC: G10L15/30 , G10L15/32 , G10L15/26 , G10L15/183 , G10L15/22 , G10L15/08

CPC classification number: G10L15/30 , G10L15/26 , G10L15/32 , G10L15/183 , G10L2015/088 , G10L2015/223

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.
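The two-threshold flow described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold values, the function names, the `<hotword>` tag format, and the idea of passing scores in as plain numbers (instead of running real acoustic models) are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical values: the first (client) threshold is permissive,
# the second (server) threshold is more restrictive.
CLIENT_THRESHOLD = 0.5
SERVER_THRESHOLD = 0.9


def server_check(server_score: float, transcript: str) -> Optional[str]:
    """Stand-in for the server: return tagged text only if the stricter threshold is met."""
    if server_score >= SERVER_THRESHOLD:
        return f"<hotword>{transcript}</hotword>"
    return None


def client_handle(client_score: float, server_score: float, transcript: str) -> Optional[str]:
    """Forward the audio to the server only when the cheap first-stage check passes."""
    if client_score < CLIENT_THRESHOLD:
        return None  # first threshold not satisfied: nothing is sent to the server
    return server_check(server_score, transcript)


if __name__ == "__main__":
    print(client_handle(0.6, 0.95, "ok google"))
```

In this sketch a permissive first stage keeps missed detections low, while the stricter second stage filters out false accepts before any tagged text is returned.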

92.

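Granted invention patent
Speech processing (Active)

Publication (announcement) number: US11676577B2

Publication (announcement) date: 2023-06-13

Application number: US17447282

Application date: 2021-09-09

Applicant: Google LLC

Inventor: Petar Aleksic , Benjamin Paul Hillson Haynor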

IPC: G06F40/30 , G10L15/00 , G10L15/07 , G10L15/24 , G10L15/06 , G06F40/45 , G10L15/187 , G10L15/197

CPC classification number: G10L15/063 , G06F40/45 , G10L15/005 , G10L15/187 , G10L15/197

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for adapting a language model are disclosed. In one aspect, a method includes the actions of receiving transcriptions of utterances that were received by computing devices operating in a domain and that are in a source language. The actions further include generating translated transcriptions of the transcriptions of the utterances in a target language. The actions further include receiving a language model for the target language. The actions further include biasing the language model for the target language by increasing the likelihood of the language model selecting terms included in the translated transcriptions. The actions further include generating a transcription of an utterance in the target language using the biased language model and while operating in the domain.
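As a rough illustration of the biasing step, the sketch below boosts terms that occur in translated transcriptions and renormalizes. The unigram dictionary standing in for the target-language model, the `boost` factor, and the function name are assumptions for illustration; the abstract does not specify the model type or the biasing formula.

```python
from typing import Dict, Iterable


def bias_unigram_model(
    model: Dict[str, float],
    translated_transcriptions: Iterable[str],
    boost: float = 2.0,
) -> Dict[str, float]:
    """Raise the probability of terms seen in the translated transcriptions, then renormalize."""
    # Collect every term that appears in the translated, in-domain transcriptions.
    terms = {w for line in translated_transcriptions for w in line.lower().split()}
    # Multiply the probability of those terms by the (hypothetical) boost factor.
    scaled = {w: p * (boost if w in terms else 1.0) for w, p in model.items()}
    total = sum(scaled.values())
    return {w: p / total for w, p in scaled.items()}


if __name__ == "__main__":
    target_model = {"table": 0.25, "cable": 0.25, "menu": 0.5}
    print(bias_unigram_model(target_model, ["a table for two"]))
```

After biasing, "table" becomes more likely than the acoustically similar "cable", which is the effect the abstract describes for terms taken from the domain.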

93.

Granted invention patent
Voice to text conversion based on third-party agent content (Active)

Publication (announcement) number: US11626115B2

Publication (announcement) date: 2023-04-11

Application number: US17582926

Application date: 2022-01-24

Applicant: Google LLC

Inventor: Barnaby James , Bo Wang , Sunil Vemuri , David Schairer , Ulas Kirazci , Ertan Dogrultan , Petar Aleksic

IPC: G10L15/26 , G10L15/22 , G06F40/284 , G06F40/205 , G06F40/30 , G10L15/183 , G10L15/18 , G10L15/30

Abstract: Implementations relate to dynamically, and in a context-sensitive manner, biasing voice to text conversion. In some implementations, the biasing of voice to text conversions is performed by a voice to text engine of a local agent, and the biasing is based at least in part on content provided to the local agent by a third-party (3P) agent that is in network communication with the local agent. In some of those implementations, the content includes contextual parameters that are provided by the 3P agent in combination with responsive content generated by the 3P agent during a dialog that: is between the 3P agent, and a user of a voice-enabled electronic device; and is facilitated by the local agent. The contextual parameters indicate potential feature(s) of further voice input that is to be provided in response to the responsive content generated by the 3P agent.

94.

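Granted invention patent
Language model biasing modulation (Active)

Publication (announcement) number: US11532299B2

Publication (announcement) date: 2022-12-20

Application number: US16896779

Application date: 2020-06-09

Applicant: Google LLC

Inventor: Pedro J. Moreno Mengibar , Petar Aleksic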

IPC: G10L15/00 , G10L15/07 , G10L15/197 , G10L15/183 , G10L15/24

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).
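One simple way to picture adjusting a biasing parameter by a context confidence score is linear interpolation between "no bias" and the full bias. The formula and the function name below are assumptions for illustration only; the abstract does not state how the adjustment is computed.

```python
def modulated_boost(base_boost: float, confidence: float) -> float:
    """Scale a biasing boost by context confidence (hypothetical linear rule).

    confidence = 0.0 yields 1.0 (no biasing); confidence = 1.0 yields base_boost.
    """
    # Clamp the confidence score into [0, 1] before using it.
    confidence = min(max(confidence, 0.0), 1.0)
    return 1.0 + (base_boost - 1.0) * confidence


if __name__ == "__main__":
    for c in (0.0, 0.5, 1.0):
        print(c, modulated_boost(3.0, c))
```

Under this assumed rule, a weakly supported context barely changes the baseline language model, while a confident context applies the full boost.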

95.

Invention patent application
SELECTIVELY INVOKING AN AUTOMATED ASSISTANT BASED ON DETECTED ENVIRONMENTAL CONDITIONS WITHOUT NECESSITATING VOICE-BASED INVOCATION OF THE AUTOMATED ASSISTANT (Active)

Publication (announcement) number: US20220310089A1

Publication (announcement) date: 2022-09-29

Application number: US17251481

Application date: 2020-01-17

Applicant: Google LLC

Inventor: Petar Aleksic , Pedro Jose Moreno Mengibar

IPC: G10L15/22 , G10L15/06 , G06F3/16 , G10L15/18

Abstract: Implementations set forth herein relate to an automated assistant that is invoked according to contextual signals—in lieu of requiring a user to explicitly speak an invocation phrase. When a user is in an environment with an assistant-enabled device, contextual data characterizing features of the environment can be processed to determine whether a user intends to invoke the automated assistant. Therefore, when such features are detected by the automated assistant, the automated assistant can bypass requiring an invocation phrase from a user and, instead, be responsive to one or more assistant commands from the user. The automated assistant can operate based on a trained machine learning model that is trained using instances of training data that characterize previous interactions in which one or more users invoked or did not invoke the automated assistant.

96.

Invention patent application
Contextual Denormalization For Automatic Speech Recognition (Active)

Publication (announcement) number: US20220277749A1

Publication (announcement) date: 2022-09-01

Application number: US17652923

Application date: 2022-02-28

Applicant: Google LLC

Inventor: Assaf Hurwitz Michaely , Petar Aleksic , Pedro J. Moreno Mengibar

IPC: G10L15/26 , G06F40/56 , G10L15/22

Abstract: A method includes receiving a speech input from a user and obtaining context metadata associated with the speech input. The method also includes generating a raw speech recognition result corresponding to the speech input and selecting a list of one or more denormalizers to apply to the generated raw speech recognition result based on the context metadata associated with the speech input. The generated raw speech recognition result includes normalized text. The method also includes denormalizing the generated raw speech recognition result into denormalized text by applying the list of the one or more denormalizers in sequence to the generated raw speech recognition result.
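The idea of selecting denormalizers from context metadata and applying them in sequence can be sketched as follows. The two denormalizers, the `mode` metadata key, and the selection table are hypothetical examples, not details taken from the application.

```python
import re
from typing import Callable, Dict, List

Denormalizer = Callable[[str], str]


def spoken_punctuation(text: str) -> str:
    """Replace the spoken words 'period' and 'comma' with punctuation marks."""
    text = re.sub(r"\s+period\b", ".", text)
    return re.sub(r"\s+comma\b", ",", text)


def capitalize_first(text: str) -> str:
    """Upper-case the first character of the text."""
    return text[:1].upper() + text[1:]


# Hypothetical mapping from a context value to an ordered list of denormalizers.
DENORMALIZERS_BY_MODE: Dict[str, List[Denormalizer]] = {
    "dictation": [spoken_punctuation, capitalize_first],
}
DEFAULT_DENORMALIZERS: List[Denormalizer] = [capitalize_first]


def denormalize(raw_result: str, context: Dict[str, str]) -> str:
    """Pick denormalizers from the context metadata and apply them in order."""
    selected = DENORMALIZERS_BY_MODE.get(context.get("mode", ""), DEFAULT_DENORMALIZERS)
    text = raw_result
    for step in selected:
        text = step(text)
    return text


if __name__ == "__main__":
    print(denormalize("hello world period", {"mode": "dictation"}))
```

Because the list is applied in order, the output of each denormalizer is the input of the next, so the same raw recognition result can be rendered differently depending on the context.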

97.

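Invention patent application
DETERMINING DIALOG STATES FOR LANGUAGE MODELS (Active)

Publication (announcement) number: US20220165270A1

Publication (announcement) date: 2022-05-26

Application number: US17650567

Application date: 2022-02-10

Applicant: Google LLC

Inventor: Petar Aleksic , Pedro Jose Moreno Mengibar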

IPC: G10L15/22 , G10L15/26 , G06F40/30 , G06F40/295

Abstract: Systems, methods, devices, and other techniques are described herein for determining dialog states that correspond to voice inputs and for biasing a language model based on the determined dialog states. In some implementations, a method includes receiving, at a computing system, audio data that indicates a voice input and determining a particular dialog state, from among a plurality of dialog states, which corresponds to the voice input. A set of n-grams can be identified that are associated with the particular dialog state that corresponds to the voice input. In response to identifying the set of n-grams that are associated with the particular dialog state that corresponds to the voice input, a language model can be biased by adjusting probability scores that the language model indicates for n-grams in the set of n-grams. The voice input can be transcribed using the adjusted language model.
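A toy version of dialog-state biasing is sketched below as rescoring of candidate transcriptions. The state names, the n-gram sets, the additive `bonus`, and the use of plain substring matching are all assumptions for illustration; the abstract describes adjusting the language model's probability scores for the n-grams, not this particular rescoring scheme.

```python
from typing import Dict, List

# Hypothetical dialog states and the n-grams associated with each one.
STATE_NGRAMS: Dict[str, List[str]] = {
    "ask_pizza_topping": ["pepperoni", "extra cheese"],
    "ask_date": ["tomorrow", "next friday"],
}


def transcribe(candidates: Dict[str, float], dialog_state: str, bonus: float = 1.0) -> str:
    """Return the best candidate after rewarding n-grams tied to the dialog state.

    candidates maps each candidate transcription to a base log-score.
    """
    ngrams = STATE_NGRAMS.get(dialog_state, [])

    def biased_score(item):
        text, base = item
        # Simplification: an n-gram counts if it occurs as a substring of the candidate.
        return base + bonus * sum(1 for g in ngrams if g in text)

    return max(candidates.items(), key=biased_score)[0]


if __name__ == "__main__":
    hypotheses = {"pepper only": -1.0, "pepperoni": -1.5}
    print(transcribe(hypotheses, "ask_pizza_topping"))
```

With no matching dialog state the higher base score wins; once the state is known, the associated n-gram tips the decision toward the contextually expected phrase.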

98.

Granted invention patent
Word lattice augmentation for automatic speech recognition (Active)

Publication (announcement) number: US11238227B2

Publication (announcement) date: 2022-02-01

Application number: US16622657

Application date: 2019-06-27

Applicant: Google LLC

Inventor: Leonid Velikovich , Petar Aleksic , Pedro Moreno

IPC: G10L15/22 , G10L15/187 , G06F40/295 , G06F40/30 , G10L15/06

Abstract: Speech processing techniques are disclosed that enable determining a text representation of named entities in captured audio data. Various implementations include determining the location of a carrier phrase in a word lattice representation of the captured audio data, where the carrier phrase provides an indication of a named entity. Additional or alternative implementations include matching a candidate named entity with the portion of the word lattice, and augmenting the word lattice with the matched candidate named entity.

99.

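Invention patent application
SPEECH PROCESSING (Active)

Publication (announcement) number: US20210398519A1

Publication (announcement) date: 2021-12-23

Application number: US17447282

Application date: 2021-09-09

Applicant: Google LLC

Inventor: Petar Aleksic , Benjamin Paul Hillson Haynor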

IPC: G10L15/06 , G06F40/45 , G10L15/00 , G10L15/187 , G10L15/197

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for adapting a language model are disclosed. In one aspect, a method includes the actions of receiving transcriptions of utterances that were received by computing devices operating in a domain and that are in a source language. The actions further include generating translated transcriptions of the transcriptions of the utterances in a target language. The actions further include receiving a language model for the target language. The actions further include biasing the language model for the target language by increasing the likelihood of the language model selecting terms included in the translated transcriptions. The actions further include generating a transcription of an utterance in the target language using the biased language model and while operating in the domain.

100.

Invention patent application
ALLOWING SPELLING OF ARBITRARY WORDS (Active)

Publication (announcement) number: US20210350074A1

Publication (announcement) date: 2021-11-11

Application number: US17443330

Application date: 2021-07-24

Applicant: Google LLC

Inventor: Evgeny A. Cherepanov , Gleb Skobeltsyn , Jakob Nicolaus Foerster , Petar Aleksic , Assaf Avner Hurwitz Michaely

IPC: G06F40/232 , G10L15/32 , G10L15/26 , G10L15/197 , G10L15/187 , G10L15/22 , G06F3/16 , G10L15/19 , G10L15/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.
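The final merge step can be pictured with a small sketch that replaces the user-selected tokens with a word assembled from spelled letters. The token-index selection, the space-separated letter format, and the function names are assumptions for illustration; recognizing the spelled letters from audio is outside the scope of this sketch.

```python
def spelled_to_word(spelled: str) -> str:
    """Join space-separated letters (e.g. 'j o n') into a single word."""
    return "".join(spelled.split())


def merge_correction(first_output: str, start: int, end: int, spelled: str) -> str:
    """Replace tokens [start, end) of the first recognition output with the spelled word."""
    tokens = first_output.split()
    corrected = spelled_to_word(spelled)
    return " ".join(tokens[:start] + [corrected] + tokens[end:])


if __name__ == "__main__":
    print(merge_correction("call john smith", 1, 2, "j o n"))
```

Only the selected span changes; the rest of the first recognition output is carried over unchanged into the second recognition output.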
