Fast and robust unsupervised contextual biasing for speech recognition

    Publication No.: US11830477B2

    Publication date: 2023-11-28

    Application No.: US16993797

    Filing date: 2020-08-14

    Abstract: An automatic speech recognition (ASR) system that determines a textual representation of a word from a word spoken in a natural language is provided. The ASR system uses an acoustic model, a language model, and a decoder. When the ASR system receives a spoken word, the acoustic model generates word candidates for the spoken word. The language model determines an n-gram score for each word candidate. The n-gram score includes a base score and a bias score. The bias score is based on a logarithmic probability of the word candidate, where the logarithmic probability is derived using a class-based language model where the words are clustered into non-overlapping clusters according to word statistics. The decoder decodes a textual representation of the spoken word from the word candidates and the corresponding n-gram score for each word candidate.
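The scoring described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the cluster table, the probabilities, the `bias_weight` parameter, the floor for unknown words, and the simple additive combination of acoustic score, base score, and bias score.

```python
import math

# Hypothetical hard clustering: every word maps to exactly one cluster,
# so the clusters are non-overlapping.
WORD_TO_CLUSTER = {"paris": "CITY", "london": "CITY", "pairs": "NOUN", "pears": "NOUN"}

# Hypothetical class-based model estimated from the biasing context:
# P(word) = P(cluster) * P(word | cluster)
CLUSTER_PROB = {"CITY": 0.6, "NOUN": 0.4}
WORD_GIVEN_CLUSTER = {"paris": 0.7, "london": 0.3, "pairs": 0.5, "pears": 0.5}
FLOOR = math.log(1e-6)  # assumed log-probability for words outside the clustering


def class_log_prob(word):
    """Log-probability of a word under the class-based model."""
    cluster = WORD_TO_CLUSTER.get(word)
    if cluster is None:
        return FLOOR
    return math.log(CLUSTER_PROB[cluster]) + math.log(WORD_GIVEN_CLUSTER[word])


def ngram_score(base_log_prob, word, bias_weight=1.0):
    """n-gram score = base score + bias score (assumed to be a weighted log-prob)."""
    return base_log_prob + bias_weight * class_log_prob(word)


def decode(candidates, bias_weight=1.0):
    """Pick the candidate with the best acoustic + n-gram score.

    candidates maps word -> (acoustic_log_score, base_lm_log_prob).
    """
    def total(word):
        acoustic, base = candidates[word]
        return acoustic + ngram_score(base, word, bias_weight)
    return max(candidates, key=total)


candidates = {"paris": (-1.0, -3.0), "pairs": (-0.9, -2.9)}
print(decode(candidates, bias_weight=0.0))  # bias disabled
print(decode(candidates))                   # bias enabled
```

With the bias disabled, the acoustically favored "pairs" wins. With the bias enabled, the context-relevant "paris" wins, because its cluster carries more probability mass in the toy biasing model.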

    Systems and methods for query autocompletion

    Publication No.: US11625436B2

    Publication date: 2023-04-11

    Application No.: US17119941

    Filing date: 2020-12-11

    Abstract: Embodiments described herein provide a query autocompletion (QAC) framework at the subword level. Specifically, the QAC framework employs a subword encoder that converts the sequence of input alphabet letters into a sequence of output subwords. The subword candidate sequences generated by the subword encoder are then passed to an n-gram language model, which performs beam search over them. Because user queries for search engines are generally short, e.g., ranging from 10 to 30 characters, the n-gram language model at the subword level may be used for modeling such short contexts, and it outperforms the traditional language model in both completion accuracy and runtime speed. Furthermore, key computations are performed prior to runtime to prepare segmentation candidates that support the subword encoder in generating subword candidate sequences, thus eliminating significant computational overhead.
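The pipeline in this abstract (segment the typed prefix into subword candidates, then beam-search an n-gram model over them) can be sketched as follows. This is a toy illustration and not the patented method: the vocabulary, the bigram probabilities, the `_` word-boundary marker, the `UNSEEN` floor, and the function names are all hypothetical, and segmentation candidates are computed on the fly rather than precomputed.

```python
import math

# Hypothetical subword vocabulary; "_" marks the start of a word.
VOCAB = ["_new", "_york", "_times", "_weather", "_yo", "ga"]
# Hypothetical bigram probabilities over subwords.
BIGRAM = {
    ("<s>", "_new"): 0.9, ("_new", "_york"): 0.6, ("_new", "_yo"): 0.1,
    ("_yo", "ga"): 0.9, ("_york", "</s>"): 0.5, ("_york", "_times"): 0.4,
    ("_york", "_weather"): 0.1, ("_times", "</s>"): 0.9,
    ("_weather", "</s>"): 0.9, ("ga", "</s>"): 0.9,
}
UNSEEN = 1e-6  # assumed probability floor for unseen bigrams


def lm(prev, nxt):
    """Bigram log-probability with a floor for unseen pairs."""
    return math.log(BIGRAM.get((prev, nxt), UNSEEN))


def segmentations(prefix):
    """Yield (subwords, remainder) pairs that spell out `prefix`.

    `remainder` is a trailing partial subword (possibly empty) that some
    vocabulary entry can still complete.
    """
    if any(v.startswith(prefix) and v != prefix for v in VOCAB):
        yield [], prefix
    for v in VOCAB:
        if prefix.startswith(v):
            for tokens, rem in segmentations(prefix[len(v):]):
                yield [v] + tokens, rem


def complete(query, beam_size=3, max_steps=4):
    """Return up to `beam_size` completions of `query`, best first."""
    prefix = "_" + query.replace(" ", "_")
    beams = []  # entries are (log_prob, tokens, pending_remainder)
    for tokens, rem in segmentations(prefix):
        score, prev = 0.0, "<s>"
        for tok in tokens:
            score += lm(prev, tok)
            prev = tok
        beams.append((score, tokens, rem))
    finished = []
    for _ in range(max_steps):
        expanded = []
        for score, tokens, rem in beams:
            prev = tokens[-1] if tokens else "<s>"
            # A pending remainder restricts the next subword to its completions.
            options = [v for v in VOCAB if v.startswith(rem) and v != rem]
            if not rem:
                options.append("</s>")
            for nxt in options:
                cand = (score + lm(prev, nxt), tokens + [nxt], "")
                (finished if nxt == "</s>" else expanded).append(cand)
        beams = sorted(expanded, key=lambda b: b[0], reverse=True)[:beam_size]
    finished.sort(key=lambda b: b[0], reverse=True)
    return ["".join(toks[:-1]).replace("_", " ").strip()
            for _score, toks, _rem in finished[:beam_size]]


print(complete("new yo"))
```

For the prefix "new yo", the encoder step yields two segmentation candidates: `_new` followed by the partial subword `_yo`, and `_new` followed by the complete subword `_yo`. With these toy probabilities the call returns `['new york', 'new york times', 'new yoga']`. A production system would precompute the segmentation candidates offline, as the abstract describes, instead of recursing at query time.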

    Fast and Robust Unsupervised Contextual Biasing for Speech Recognition

    Publication No.: US20210343274A1

    Publication date: 2021-11-04

    Application No.: US16993797

    Filing date: 2020-08-14

    Abstract: An automatic speech recognition (ASR) system that determines a textual representation of a word from a word spoken in a natural language is provided. The ASR system uses an acoustic model, a language model, and a decoder. When the ASR system receives a spoken word, the acoustic model generates word candidates for the spoken word. The language model determines an n-gram score for each word candidate. The n-gram score includes a base score and a bias score. The bias score is based on a logarithmic probability of the word candidate, where the logarithmic probability is derived using a class-based language model where the words are clustered into non-overlapping clusters according to word statistics. The decoder decodes a textual representation of the spoken word from the word candidates and the corresponding n-gram score for each word candidate.
