Language model biasing system
    5.
    Granted patent

    Publication number: US11682383B2

    Publication date: 2023-06-20

    Application number: US17337400

    Filing date: 2021-06-02

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus for receiving audio data corresponding to a user utterance and context data, identifying an initial set of one or more n-grams from the context data, generating an expanded set of one or more n-grams based on the initial set of n-grams, adjusting a language model based at least on the expanded set of n-grams, determining one or more speech recognition candidates for at least a portion of the user utterance using the adjusted language model, adjusting a score for a particular speech recognition candidate determined to be included in the expanded set of n-grams, determining a transcription of the user utterance that includes at least one of the one or more speech recognition candidates, and providing the transcription of the user utterance for output.
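The biasing flow in this abstract can be illustrated with a toy sketch. It is not the patented implementation: the `related` lookup table, the multiplicative `boost`, and the function names are assumptions made for illustration, and the separate "adjust the language model" step is collapsed into a single rescoring pass over candidates.

```python
def expand_ngrams(initial, related):
    """Return the initial n-grams plus n-grams related to them.

    `related` is a hypothetical lookup table; the abstract does not say
    how the expanded set is produced.
    """
    expanded = set(initial)
    for ngram in initial:
        expanded.update(related.get(ngram, ()))
    return expanded


def bias_scores(candidates, expanded, boost=2.0):
    """Raise the score of every candidate found in the expanded set.

    A multiplicative boost is an assumption; any monotone adjustment
    would illustrate the same idea.
    """
    return {
        text: score * boost if text in expanded else score
        for text, score in candidates.items()
    }


# Context data mentions "italian"; the expansion adds related food terms.
initial = {"italian"}
related = {"italian": ["pasta", "pizza"]}
expanded = expand_ngrams(initial, related)

# Recognition candidates for one portion of the utterance, with scores.
candidates = {"faster": 0.5, "pasta": 0.4}
biased = bias_scores(candidates, expanded)

# The transcription uses the highest-scoring candidate after biasing.
best = max(biased, key=biased.get)
```

Without biasing, "faster" would win on its raw score; with the expanded context set, "pasta" is selected instead.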

    CROSS-LINGUAL SPEECH RECOGNITION
    6.
    Patent application

    Publication number: US20220383862A1

    Publication date: 2022-12-01

    Application number: US17817176

    Filing date: 2022-08-03

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross-lingual speech recognition are disclosed. In one aspect, a method includes the actions of determining a context of a second computing device. The actions further include identifying, by a first computing device, an additional pronunciation for a term of multiple terms. The actions further include including the additional pronunciation for the term in the lexicon. The actions further include receiving audio data of an utterance. The actions further include generating a transcription of the utterance by using the lexicon that includes the multiple terms and the pronunciation for each of the multiple terms and the additional pronunciation for the term. The actions further include after generating the transcription of the utterance, removing the additional pronunciation for the term from the lexicon. The actions further include providing, for output, the transcription.
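The add, transcribe, then remove sequence in this abstract maps naturally onto a context manager. The sketch below is a minimal illustration under assumptions: the lexicon is modelled as a plain dict from term to a list of pronunciation strings, `transcribe` is a toy stand-in for a recognizer, and the phoneme strings are made up.

```python
from contextlib import contextmanager


@contextmanager
def temporary_pronunciation(lexicon, term, pronunciation):
    """Add an extra pronunciation for `term`, then remove it on exit.

    Assumes `term` is already in the lexicon with at least one
    pronunciation, as the abstract describes.
    """
    lexicon.setdefault(term, []).append(pronunciation)
    try:
        yield lexicon
    finally:
        lexicon[term].remove(pronunciation)


def transcribe(phonemes, lexicon):
    """Toy recognizer: return the term whose pronunciation matches exactly."""
    for term, pronunciations in lexicon.items():
        if phonemes in pronunciations:
            return term
    return None


# Lexicon with one pronunciation per term (hypothetical phoneme strings).
lexicon = {"croissant": ["k r ah s aa n t"]}
utterance = "k w aa s aa n"  # pronunciation from another language

# The device context suggests the second language, so add its pronunciation
# only for the duration of this recognition request.
with temporary_pronunciation(lexicon, "croissant", utterance) as active:
    during = transcribe(utterance, active)

after = transcribe(utterance, lexicon)
```

Using `try`/`finally` ensures the extra pronunciation is removed even if recognition raises, so the lexicon returns to its original state after each request.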

    Cross-lingual speech recognition
    7.
    Granted patent

    Publication number: US11437025B2

    Publication date: 2022-09-06

    Application number: US16593564

    Filing date: 2019-10-04

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for cross-lingual speech recognition are disclosed. In one aspect, a method includes the actions of determining a context of a second computing device. The actions further include identifying, by a first computing device, an additional pronunciation for a term of multiple terms. The actions further include including the additional pronunciation for the term in the lexicon. The actions further include receiving audio data of an utterance. The actions further include generating a transcription of the utterance by using the lexicon that includes the multiple terms and the pronunciation for each of the multiple terms and the additional pronunciation for the term. The actions further include after generating the transcription of the utterance, removing the additional pronunciation for the term from the lexicon. The actions further include providing, for output, the transcription.

    WORD LATTICE AUGMENTATION FOR AUTOMATIC SPEECH RECOGNITION

    Publication number: US20220229992A1

    Publication date: 2022-07-21

    Application number: US17589186

    Filing date: 2022-01-31

    Applicant: GOOGLE LLC

    Abstract: Speech processing techniques are disclosed that enable determining a text representation of named entities in captured audio data. Various implementations include determining the location of a carrier phrase in a word lattice representation of the captured audio data, where the carrier phrase provides an indication of a named entity. Additional or alternative implementations include matching a candidate named entity with the portion of the word lattice, and augmenting the word lattice with the matched candidate named entity.
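As a rough illustration of the two steps named in this abstract (locating a carrier phrase, then matching and adding a named entity), the sketch below models a word lattice as a list of `(start_node, end_node, word)` arcs. The single-word carrier, the string-similarity matching via `difflib`, and the threshold are all assumptions; the abstract does not specify how matching is performed.

```python
from difflib import SequenceMatcher


def paths_from(arcs, node, final):
    """Enumerate word sequences along every path from `node` to `final`."""
    if node == final:
        return [[]]
    found = []
    for start, end, word in arcs:
        if start == node:
            found.extend([word] + rest for rest in paths_from(arcs, end, final))
    return found


def augment_lattice(arcs, carrier, entities, threshold=0.8):
    """Add an arc for each entity that resembles the words after the carrier."""
    final = max(end for _, end, _ in arcs)
    augmented = list(arcs)
    for _, end, word in arcs:
        if word != carrier:
            continue  # not the carrier phrase
        for words in paths_from(arcs, end, final):
            heard = " ".join(words)
            for entity in entities:
                score = SequenceMatcher(None, heard, entity.lower()).ratio()
                new_arc = (end, final, entity)
                if score >= threshold and new_arc not in augmented:
                    augmented.append(new_arc)
    return augmented


# Lattice for "call <name>", with competing hypotheses for each name word.
arcs = [
    (0, 1, "call"),
    (1, 2, "jon"), (1, 2, "john"),
    (2, 3, "dough"), (2, 3, "doe"),
]
contacts = ["John Doe", "Jane Roe"]  # hypothetical candidate entities
augmented = augment_lattice(arcs, "call", contacts)
```

The original arcs are kept, so the added entity arc competes with the existing hypotheses rather than replacing them.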

    Speech recognition using two language models

    Publication number: US11341972B2

    Publication date: 2022-05-24

    Application number: US17078030

    Filing date: 2020-10-22

    Applicant: Google LLC

    Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.
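The selection rule in this abstract can be sketched in a few lines. The two recognizers are stubbed as plain functions, and the fallback to the second transcription when no trigger term is found is an assumption; the abstract only states what happens when the term is present.

```python
def choose_transcription(audio, personalized, generic, trigger_terms):
    """Return the personalized result when the generic one has a trigger term.

    `personalized` and `generic` are hypothetical callables standing in for
    the first and second speech recognizers.
    """
    first = personalized(audio)   # language model based on user-specific data
    second = generic(audio)       # language model independent of user data
    words = second.lower().split()
    if any(term in words for term in trigger_terms):
        return first
    # Assumed fallback: the abstract does not describe this branch.
    return second


def personalized_recognizer(audio):
    return "call Anneliese"  # knows the user's contact names


def generic_recognizer(audio):
    return "call Anna lease"


audio = b"..."  # placeholder for encoded audio data

with_trigger = choose_transcription(
    audio, personalized_recognizer, generic_recognizer, {"call"}
)
without_trigger = choose_transcription(
    audio, personalized_recognizer, generic_recognizer, {"play"}
)
```

Here the term "call" in the generic result signals that a contact name is likely to follow, so the personalized transcription is preferred.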

    Contextual denormalization for automatic speech recognition

    Publication number: US11282525B2

    Publication date: 2022-03-22

    Application number: US17009494

    Filing date: 2020-09-01

    Applicant: Google LLC

    Abstract: A method includes receiving a speech input from a user and obtaining context metadata associated with the speech input. The method also includes generating a raw speech recognition result corresponding to the speech input and selecting a list of one or more denormalizers to apply to the generated raw speech recognition result based on the context metadata associated with the speech input. The generated raw speech recognition result includes normalized text. The method also includes denormalizing the generated raw speech recognition result into denormalized text by applying the list of the one or more denormalizers in sequence to the generated raw speech recognition result.
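A minimal sketch of context-driven denormalization follows, assuming the context metadata is a dict with an `application` key and that each denormalizer is a plain text-to-text function. The three denormalizers and the context names are invented for illustration.

```python
import re

NUMBER_WORDS = {"one": "1", "two": "2", "three": "3"}


def spell_digits(text):
    """Replace spelled-out numbers with digits."""
    return " ".join(NUMBER_WORDS.get(word, word) for word in text.split())


def spoken_punctuation(text):
    """Turn the spoken word 'period' into a full stop."""
    return re.sub(r"\s+period\b", ".", text)


def capitalize_first(text):
    """Capitalize the first character of the text."""
    return text[:1].upper() + text[1:]


# Hypothetical mapping from context to an ordered list of denormalizers.
DENORMALIZERS_BY_CONTEXT = {
    "dictation": [spell_digits, spoken_punctuation, capitalize_first],
    "search": [spell_digits],
}


def denormalize(raw_result, context):
    """Select denormalizers from the context and apply them in sequence."""
    selected = DENORMALIZERS_BY_CONTEXT.get(context.get("application"), [])
    text = raw_result
    for denormalizer in selected:
        text = denormalizer(text)
    return text


raw = "meet me at two period"  # normalized text from the recognizer
dictated = denormalize(raw, {"application": "dictation"})
searched = denormalize(raw, {"application": "search"})
```

Order matters in this sketch: running `spoken_punctuation` before `spell_digits` would attach the full stop to "two" and prevent the number from being converted, which is one reason to apply the list in a fixed sequence.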
