Contextual Biasing for Speech Recognition
    23.
    Published patent application

    Publication number: US20230274736A1

    Publication date: 2023-08-31

    Application number: US18311964

    Filing date: 2023-05-04

    Applicant: Google LLC

    CPC classification number: G10L15/187 G06N20/10 G10L19/04 G10L2015/088

    Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.
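
The abstract says the model consumes grapheme and phoneme data derived from the biasing phrases, alongside the acoustic features. As a rough illustration of that preprocessing step only, the sketch below turns each biasing phrase into a character sequence and a phoneme sequence. The lexicon `TOY_LEXICON`, the `BiasingEntry` type, and the helper names are hypothetical and not taken from the patent; a real system would presumably use a full pronunciation lexicon or a grapheme-to-phoneme model.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical toy pronunciation lexicon (ARPAbet-style symbols).
# This is an assumption for illustration, not part of the patent.
TOY_LEXICON: Dict[str, List[str]] = {
    "call": ["K", "AO", "L"],
    "joan": ["JH", "OW", "N"],
    "play": ["P", "L", "EY"],
}


@dataclass
class BiasingEntry:
    """One biasing phrase with its grapheme and phoneme sequences."""
    phrase: str
    graphemes: List[str]
    phonemes: List[str]


def to_graphemes(phrase: str) -> List[str]:
    # Graphemes here are simply the lowercase characters, spaces included.
    return list(phrase.lower())


def to_phonemes(phrase: str, lexicon: Dict[str, List[str]]) -> List[str]:
    # Look up each word; words missing from the lexicon map to "<unk>".
    phonemes: List[str] = []
    for word in phrase.lower().split():
        phonemes.extend(lexicon.get(word, ["<unk>"]))
    return phonemes


def build_biasing_entries(
    phrases: List[str], lexicon: Dict[str, List[str]] = TOY_LEXICON
) -> List[BiasingEntry]:
    # Derive both views of every biasing phrase in the context set.
    return [
        BiasingEntry(p, to_graphemes(p), to_phonemes(p, lexicon))
        for p in phrases
    ]


if __name__ == "__main__":
    for entry in build_biasing_entries(["call Joan", "play Zydeco"]):
        print(entry)
```

How the speech recognition model combines these sequences with the acoustic features is not described in the abstract, so the sketch stops at data preparation.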

    Contextual Biasing With Text Injection
    30.
    Published patent application

    Publication number: US20240153498A1

    Publication date: 2024-05-09

    Application number: US18490861

    Filing date: 2023-10-20

    Applicant: Google LLC

    CPC classification number: G10L15/16 G10L15/063 G10L15/183

    Abstract: A method includes receiving context biasing data that includes a set of unspoken textual utterances corresponding to a particular context. The method also includes obtaining a list of carrier phrases associated with the particular context. For each respective unspoken textual utterance, the method includes generating a corresponding training data pair that includes the respective unspoken textual utterance and a carrier phrase. For each respective training data pair, the method includes tokenizing the respective training data pair into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit, receiving the first higher order textual feature representation, and generating a first probability distribution over possible text units. The method also includes training a speech recognition model based on the first probability distribution over possible text units.
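
The first two steps of this abstract, pairing each unspoken textual utterance with a carrier phrase and tokenizing the pair into sub-word units, can be sketched as follows. Several details are assumptions made only for illustration: the abstract does not say how a carrier phrase is chosen (round-robin is used here), how the pair is joined into text (carrier first, then utterance), or which tokenizer is used (a greedy longest-match tokenizer over a hypothetical toy vocabulary stands in for a real sub-word model). The function names are likewise hypothetical.

```python
from typing import List, Set, Tuple


def make_training_pairs(
    utterances: List[str], carrier_phrases: List[str]
) -> List[Tuple[str, str]]:
    # One pair per unspoken textual utterance. Round-robin carrier
    # selection is an assumption; the abstract does not specify it.
    return [
        (utt, carrier_phrases[i % len(carrier_phrases)])
        for i, utt in enumerate(utterances)
    ]


def tokenize(text: str, vocab: Set[str]) -> List[str]:
    # Greedy longest-match sub-word tokenization, word by word.
    # Any character not covered by the vocabulary becomes its own token.
    # Word boundaries are not marked in this simplified sketch.
    tokens: List[str] = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            for j in range(len(word), i, -1):
                piece = word[i:j]
                if piece in vocab or j == i + 1:
                    tokens.append(piece)
                    i = j
                    break
    return tokens


def tokenize_pair(pair: Tuple[str, str], vocab: Set[str]) -> List[str]:
    # Assumed joining order: carrier phrase followed by the utterance.
    utterance, carrier = pair
    return tokenize(f"{carrier} {utterance}", vocab)


if __name__ == "__main__":
    toy_vocab = {"play", "ing", "song", "call"}  # hypothetical vocabulary
    pairs = make_training_pairs(["songs", "playing"], ["play", "call"])
    for pair in pairs:
        print(pair, tokenize_pair(pair, toy_vocab))
```

The later steps in the abstract (generating higher order textual feature representations and a probability distribution over text units, then training on that distribution) depend on model components the abstract does not detail, so they are left out of the sketch.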
