Patent search ap:("GOOGLE LLC") AND inv:"Petar Aleksic" Page 5

41.

Invention publication
SERVER SIDE HOTWORDING (under examination, published)

Publication (announcement) number: US20230343340A1

Publication (announcement) date: 2023-10-26

Application number: US18345077

Application date: 2023-06-30

Applicant: GOOGLE LLC

Inventor: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar

IPC: G10L15/30, G10L15/32, G10L15/26

CPC classification number: G10L15/30, G10L15/32, G10L15/26, G10L15/183

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.
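The two-threshold flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold values, the function names, the tag format, and the use of plain numeric scores in place of real hotword models are all assumptions made for illustration.

```python
from typing import Optional

# Assumed values: the abstract only says the second threshold is more
# restrictive than the first.
CLIENT_THRESHOLD = 0.5
SERVER_THRESHOLD = 0.9


def server_check(server_score: float, transcript: str) -> Optional[str]:
    """Hypothetical server: return tagged text only if the stricter threshold is met."""
    if server_score >= SERVER_THRESHOLD:
        return f"<hotword>{transcript}</hotword>"
    return None


def client_handle(client_score: float, server_score: float, transcript: str) -> Optional[str]:
    """Client gate: forward to the server only after the first threshold passes."""
    if client_score < CLIENT_THRESHOLD:
        return None  # first threshold not met, nothing is sent to the server
    return server_check(server_score, transcript)


print(client_handle(0.6, 0.95, "ok google play music"))
```

The permissive client check filters out most audio cheaply, and the stricter server check decides whether the utterance really is the key phrase.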

42.

Invention grant
Speech processing (in force)

Publication (announcement) number: US11138968B2

Publication (announcement) date: 2021-10-05

Application number: US16696111

Application date: 2019-11-26

Applicant: Google LLC

Inventor: Petar Aleksic, Benjamin Paul Hillson Haynor

IPC: G10L15/00, G06F40/40, G10L15/06, G06F40/45, G10L15/187, G10L15/197

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for adapting a language model are disclosed. In one aspect, a method includes the actions of receiving transcriptions of utterances that were received by computing devices operating in a domain and that are in a source language. The actions further include generating translated transcriptions of the transcriptions of the utterances in a target language. The actions further include receiving a language model for the target language. The actions further include biasing the language model for the target language by increasing the likelihood of the language model selecting terms included in the translated transcriptions. The actions further include generating a transcription of an utterance in the target language using the biased language model and while operating in the domain.
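As a rough illustration of the biasing step, the sketch below boosts terms that appear in already-translated transcriptions and renormalizes. The toy unigram model, the boost factor, and the function name are assumptions. The translation step is not modeled.

```python
from typing import Dict, Iterable


def bias_language_model(
    lm: Dict[str, float], translated: Iterable[str], boost: float = 2.0
) -> Dict[str, float]:
    """Raise the probability of terms seen in translated transcriptions.

    `lm` is a toy unigram model (term -> probability); `boost` is an assumed factor.
    """
    domain_terms = {word for line in translated for word in line.lower().split()}
    boosted = {t: (p * boost if t in domain_terms else p) for t, p in lm.items()}
    total = sum(boosted.values())
    # Renormalize so the biased model is still a probability distribution.
    return {t: p / total for t, p in boosted.items()}


lm = {"menu": 0.25, "table": 0.25, "stock": 0.5}
biased = bias_language_model(lm, ["menu please", "table for two"])
print(biased)
```

Terms from the domain transcriptions gain probability mass at the expense of unrelated terms, which is the effect the abstract describes.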

43.

Invention application
SCALABLE DYNAMIC CLASS LANGUAGE MODELING (in force)

Publication (announcement) number: US20210166682A1

Publication (announcement) date: 2021-06-03

Application number: US17172600

Application date: 2021-02-10

Applicant: Google LLC

Inventor: Justin Max Scheiner, Petar Aleksic

IPC: G10L15/197, G06F16/683, G06F16/33, G06F40/289, G10L15/22, G10L15/30

Abstract: This document generally describes systems and methods for dynamically adapting speech recognition for individual voice queries of a user using class-based language models. The method may include receiving a voice query from a user that includes audio data corresponding to an utterance of the user, and context data associated with the user. One or more class models are then generated that collectively identify a first set of terms determined based on the context data, and a respective class to which the respective term is assigned for each respective term in the first set of terms. A language model that includes a residual unigram may then be accessed and processed for each respective class to insert a respective class symbol at each instance of the residual unigram that occurs within the language model. A transcription of the utterance of the user is then generated using the modified language model.
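The class-symbol insertion step might look like the following sketch. It assumes a toy n-gram table keyed by token tuples, a placeholder token `<residual>` for the residual unigram, and class symbols such as `$CONTACT`. None of these names come from the patent, and copying the score unchanged is a simplification.

```python
from typing import Dict, List, Tuple

RESIDUAL = "<residual>"  # assumed placeholder for the residual unigram

NGrams = Dict[Tuple[str, ...], float]


def insert_class_symbols(ngrams: NGrams, class_symbols: List[str]) -> NGrams:
    """Add a class-symbol variant wherever the residual unigram occurs."""
    expanded = dict(ngrams)
    for ngram, logp in ngrams.items():
        for i, token in enumerate(ngram):
            if token == RESIDUAL:
                for symbol in class_symbols:
                    # Same n-gram with the class symbol at the residual position.
                    expanded[ngram[:i] + (symbol,) + ngram[i + 1:]] = logp
    return expanded


lm = {("call", RESIDUAL): -1.0, ("call", "mom"): -0.5}
print(insert_class_symbols(lm, ["$CONTACT", "$SONG"]))
```

At recognition time each class symbol would be expanded with the terms that the per-query class models assign to that class.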

44.

Invention grant
Contextual denormalization for automatic speech recognition (in force)

Publication (announcement) number: US10789955B2

Publication (announcement) date: 2020-09-29

Application number: US16192953

Application date: 2018-11-16

Applicant: Google LLC

Inventor: Assaf Hurwitz Michaely, Petar Aleksic, Pedro Moreno

IPC: G10L15/26, G10L15/22, G06F40/56

Abstract: A method includes receiving a speech input from a user and obtaining context metadata associated with the speech input. The method also includes generating a raw speech recognition result corresponding to the speech input and selecting a list of one or more denormalizers to apply to the generated raw speech recognition result based on the context metadata associated with the speech input. The generated raw speech recognition result includes normalized text. The method also includes denormalizing the generated raw speech recognition result into denormalized text by applying the list of the one or more denormalizers in sequence to the generated raw speech recognition result.
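A minimal sketch of selecting denormalizers from context metadata and applying them in sequence follows. The individual denormalizers, the `app` metadata key, and the lookup table are hypothetical examples rather than anything specified in the patent.

```python
from typing import Callable, Dict, List

Denormalizer = Callable[[str], str]

NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "five": "5"}


def numbers_to_digits(text: str) -> str:
    return " ".join(NUMBER_WORDS.get(word, word) for word in text.split())


def capitalize_first(text: str) -> str:
    return text[:1].upper() + text[1:]


def add_period(text: str) -> str:
    return text if text.endswith(".") else text + "."


# Assumed mapping from a context metadata value to an ordered denormalizer list.
DENORMALIZERS_BY_APP: Dict[str, List[Denormalizer]] = {
    "dictation": [numbers_to_digits, capitalize_first, add_period],
    "search": [numbers_to_digits],
}


def denormalize(raw_result: str, context: Dict[str, str]) -> str:
    """Apply the context-selected denormalizers to the normalized text, in order."""
    text = raw_result
    for denormalizer in DENORMALIZERS_BY_APP.get(context.get("app", ""), []):
        text = denormalizer(text)
    return text


print(denormalize("set timer for five minutes", {"app": "dictation"}))
```

The same raw result yields different output text depending on the context, because a different list of denormalizers is chosen.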

45.

Invention application
LANGUAGE MODEL BIASING MODULATION (under examination, published)

Publication (announcement) number: US20200302916A1

Publication (announcement) date: 2020-09-24

Application number: US16896779

Application date: 2020-06-09

Applicant: Google LLC

Inventor: Pedro J. Moreno Mengibar, Petar Aleksic

IPC: G10L15/07, G10L15/197, G10L15/183, G10L15/24

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for modulating language model biasing. In some implementations, context data is received. A likely context associated with a user is determined based on at least a portion of the context data. One or more language model biasing parameters based at least on the likely context associated with the user is selected. A context confidence score associated with the likely context based on at least a portion of the context data is determined. One or more language model biasing parameters based at least on the context confidence score is adjusted. A baseline language model based at least on the one or more of the adjusted language model biasing parameters is biased. The baseline language model is provided for use by an automated speech recognizer (ASR).
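One way to picture the modulation step is to scale per-context boosts by the context confidence score before applying them to a baseline model. In the sketch below, the context table, the boost values, and the linear scaling rule are assumptions, not details from the patent.

```python
from typing import Dict

# Assumed biasing parameters: log-probability boosts per likely context.
BIASING_PARAMS: Dict[str, Dict[str, float]] = {
    "restaurant": {"pizza": 2.0},
    "driving": {"navigate": 2.0},
}


def adjust_params(params: Dict[str, float], confidence: float) -> Dict[str, float]:
    """Scale each boost by the context confidence score (assumed linear rule)."""
    return {term: boost * confidence for term, boost in params.items()}


def bias_baseline(
    baseline: Dict[str, float], context: str, confidence: float
) -> Dict[str, float]:
    """Return a biased copy of a toy baseline model (term -> log score)."""
    boosts = adjust_params(BIASING_PARAMS.get(context, {}), confidence)
    return {term: score + boosts.get(term, 0.0) for term, score in baseline.items()}


baseline = {"pizza": -5.0, "piazza": -4.0}
print(bias_baseline(baseline, "restaurant", confidence=0.5))
```

A low confidence score leaves the baseline model nearly unchanged, while a high score applies the full boost.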

46.

Invention application
Voice Recognition System (under examination, published)

Publication (announcement) number: US20200227046A1

Publication (announcement) date: 2020-07-16

Application number: US16837250

Application date: 2020-04-01

Applicant: Google LLC

Inventor: Petar Aleksic, Pedro J. Moreno Mengibar

IPC: G10L15/26, G06F16/632, G10L15/19, G10L15/197, G10L15/04

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for voice recognition. In one aspect, a method includes the actions of receiving a voice input; determining a transcription for the voice input, wherein determining the transcription for the voice input includes, for a plurality of segments of the voice input: obtaining a first candidate transcription for a first segment of the voice input; determining one or more contexts associated with the first candidate transcription; adjusting a respective weight for each of the one or more contexts; and determining a second candidate transcription for a second segment of the voice input based in part on the adjusted weights; and providing the transcription of the plurality of segments of the voice input for output.
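The segment-by-segment weighting can be sketched as follows. The keyword table, the fixed increment of 1.0, and the candidate tuples are illustrative assumptions. A real recognizer would score candidates with acoustic and language models.

```python
from typing import Dict, List, Set, Tuple

# Assumed mapping from contexts to words that signal them.
CONTEXT_KEYWORDS: Dict[str, Set[str]] = {
    "music": {"play"},
    "navigation": {"navigate", "directions"},
}


def adjust_weights(weights: Dict[str, float], first_segment: str) -> Dict[str, float]:
    """Raise the weight of each context suggested by the first candidate transcription."""
    words = set(first_segment.lower().split())
    return {
        ctx: w + (1.0 if words & CONTEXT_KEYWORDS.get(ctx, set()) else 0.0)
        for ctx, w in weights.items()
    }


def pick_second_segment(
    candidates: List[Tuple[str, str, float]], weights: Dict[str, float]
) -> str:
    """Choose the (text, context, base_score) candidate with the best weighted score."""
    return max(candidates, key=lambda c: c[2] + weights.get(c[1], 0.0))[0]


weights = adjust_weights({"music": 0.0, "navigation": 0.0}, "play")
candidates = [("beat it", "music", 0.4), ("be edit", "general", 0.6)]
print(pick_second_segment(candidates, weights))
```

Here the word "play" in the first segment raises the music context, so the music-related candidate wins for the second segment even though its base score is lower.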

47.

Invention application
ALLOWING SPELLING OF ARBITRARY WORDS (under examination, published)

Publication (announcement) number: US20200168212A1

Publication (announcement) date: 2020-05-28

Application number: US16751215

Application date: 2020-01-24

Applicant: Google LLC

Inventor: Evgeny A Cherepanov, Gleb Skobeltsyn, Jakob Nicolaus Foerster, Petar Aleksic, Assaf Avner Hurwitz Michaely

IPC: G10L15/19, G10L15/22, G06F3/16, G10L15/30

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for natural language processing. One of the methods includes receiving a first voice input from a user device; generating a first recognition output; receiving a user selection of one or more terms in the first recognition output; receiving a second voice input spelling a correction of the user selection; determining a corrected recognition output for the selected portion; and providing a second recognition output that merges the first recognition output and the corrected recognition output.
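The merge step can be sketched in a few lines. This sketch assumes the spelled correction has already been recognized as a space-separated letter sequence and that the user selection is a substring of the first recognition output. Both function names are hypothetical.

```python
def letters_to_word(spelled: str) -> str:
    """Join a recognized letter sequence such as 'a l e k s i c' into one word."""
    return "".join(spelled.split())


def merge_correction(first_output: str, selection: str, spelled: str) -> str:
    """Replace the first occurrence of the selected span with the spelled correction."""
    corrected = letters_to_word(spelled)
    return first_output.replace(selection, corrected, 1)


print(merge_correction("call alexic tomorrow", "alexic", "a l e k s i c"))
```

Only the selected portion changes. The rest of the first recognition output is carried over into the second output.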

48.

Invention application
SERVER SIDE HOTWORDING (under examination, published)

Publication (announcement) number: US20190304465A1

Publication (announcement) date: 2019-10-03

Application number: US16392829

Application date: 2019-04-24

Applicant: Google LLC

Inventor: Alexander H. Gruenstein, Petar Aleksic, Johan Schalkwyk, Pedro J. Moreno Mengibar

IPC: G10L15/30, G10L15/32, G10L15/26

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting hotwords using a server. One of the methods includes receiving an audio signal encoding one or more utterances including a first utterance; determining whether at least a portion of the first utterance satisfies a first threshold of being at least a portion of a key phrase; in response to determining that at least the portion of the first utterance satisfies the first threshold of being at least a portion of a key phrase, sending the audio signal to a server system that determines whether the first utterance satisfies a second threshold of being the key phrase, the second threshold being more restrictive than the first threshold; and receiving tagged text data representing the one or more utterances encoded in the audio signal when the server system determines that the first utterance satisfies the second threshold.

49.

Invention grant
Recognizing speech with mixed speech recognition models to generate transcriptions (in force)

Publication (announcement) number: US10354650B2

Publication (announcement) date: 2019-07-16

Application number: US13838379

Application date: 2013-03-15

Applicant: Google LLC

Inventor: Alexander H. Gruenstein, Petar Aleksic

IPC: G10L15/26, G10L15/18, G10L15/22, G10L15/32, G10L15/193, G10L15/30, G10L15/197

Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.
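The selection rule between the two transcriptions can be sketched as below. The predefined term set and the fallback to the second transcription when no term matches are assumptions. The abstract only specifies the case where a term is found.

```python
from typing import Set

# Assumed predefined set of terms that signals user-specific content.
PREDEFINED_TERMS: Set[str] = {"call", "text"}


def choose_transcription(first: str, second: str) -> str:
    """Pick between the user-specific (first) and general (second) transcriptions."""
    if any(word in PREDEFINED_TERMS for word in second.lower().split()):
        return first  # user-specific recognizer output, as in the abstract
    return second  # assumed fallback, not stated in the abstract


print(choose_transcription("call Petar Aleksic", "call peter alexis"))
```

The general recognizer acts as a detector for commands that likely involve user data, and the user-specific recognizer supplies the final text in those cases.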

50.

Invention grant
Voice recognition system (in force)

Publication (announcement) number: US10269354B2

Publication (announcement) date: 2019-04-23

Application number: US15910872

Application date: 2018-03-02

Applicant: Google LLC

Inventor: Petar Aleksic, Pedro J. Moreno Mengibar

IPC: G10L15/22, G10L15/26, G10L15/19, G10L15/197, G06F17/30, G10L15/04, G10L15/08, G10L15/183

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for voice recognition. In one aspect, a method includes the actions of receiving a voice input; determining a transcription for the voice input, wherein determining the transcription for the voice input includes, for a plurality of segments of the voice input: obtaining a first candidate transcription for a first segment of the voice input; determining one or more contexts associated with the first candidate transcription; adjusting a respective weight for each of the one or more contexts; and determining a second candidate transcription for a second segment of the voice input based in part on the adjusted weights; and providing the transcription of the plurality of segments of the voice input for output.
