Ensemble modeling of automatic speech recognition output

    Publication No.: US11094326B2

    Publication Date: 2021-08-17

    Application No.: US16056298

    Application Date: 2018-08-06

    Abstract: One embodiment of the present invention sets forth a technique for performing ensemble modeling of automatic speech recognition (ASR) output. The technique includes generating input to a machine learning model from snippets of voice activity in a recording and from transcriptions of the recording produced by multiple ASR engines. The technique also includes applying the machine learning model to the input to select, based on transcriptions of a snippet produced by at least one contributor ASR engine and at least one selector ASR engine of the multiple ASR engines, a best transcription of the snippet from the possible transcriptions produced by the multiple ASR engines. The technique further includes storing the best transcription in association with the snippet.
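As a rough illustration of the selection step described in the abstract above, the sketch below picks one transcription per snippet from several engines. It does not reproduce the patented machine learning model: a simple agreement heuristic (average string similarity to the other engines' outputs) stands in for the trained model, and it does not distinguish contributor engines from selector engines. The engine names, the `select_best` function, and the `best_by_snippet` store are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Tuple


def agreement(a: str, b: str) -> float:
    """Similarity in [0, 1] between two transcriptions of the same snippet."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def select_best(transcriptions: Dict[str, str]) -> Tuple[str, str]:
    """Return (engine, transcription) with the highest mean agreement.

    Assumption: agreement with the other engines is used as a stand-in
    for the score a trained model would produce.
    """
    def score(engine: str) -> float:
        others = [t for e, t in transcriptions.items() if e != engine]
        if not others:  # a single engine has nothing to agree with
            return 0.0
        own = transcriptions[engine]
        return sum(agreement(own, t) for t in others) / len(others)

    best_engine = max(transcriptions, key=score)
    return best_engine, transcriptions[best_engine]


# Hypothetical outputs of three ASR engines for one snippet.
candidates = {
    "engine_a": "the quick brown fox",
    "engine_b": "the quick brown fox",
    "engine_c": "a quick crown box",
}

# Store the chosen transcription in association with the snippet.
best_by_snippet: Dict[str, str] = {}
engine, text = select_best(candidates)
best_by_snippet["snippet-0"] = text
```

A trained model would replace `score` with a learned function of features derived from the snippet and the engines' transcriptions; the surrounding select-and-store flow would stay the same.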

Characterizing accuracy of ensemble models for automatic speech recognition

    Publication No.: US11024315B2

    Publication Date: 2021-06-01

    Application No.: US16297603

    Application Date: 2019-03-09

    Abstract: One embodiment of the present invention sets forth a technique for analyzing transcriptions of a recording. The technique includes storing per-character differences between a first set of characters from a first transcription of the recording and a second set of characters from a second transcription of the recording in a matrix with a fixed width. The technique also includes encoding the per-character differences in the matrix into a vector of the fixed width. The technique further includes outputting the vector as a representation of a pairwise error rate between the first transcription and the second transcription.
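The three steps in the abstract above (per-character differences, a fixed-width matrix, a fixed-width vector) can be sketched as follows. This is one possible reading, not the patent's actual encoding: the width of 8, the 0/1 difference flags, the zero padding of the last row, and the column-mean encoding are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List

WIDTH = 8  # hypothetical fixed width of the matrix and the output vector


def per_char_diffs(first: str, second: str) -> List[int]:
    """Flag each aligned character position: 0 if equal, 1 if it differs."""
    flags: List[int] = []
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        span = max(i2 - i1, j2 - j1)  # covers replace, insert, and delete
        flags.extend([0 if tag == "equal" else 1] * span)
    return flags


def to_matrix(flags: List[int], width: int = WIDTH) -> List[List[int]]:
    """Wrap the flags row by row into a matrix with a fixed number of columns."""
    padded = flags + [0] * (-len(flags) % width)  # assumption: pad with zeros
    return [padded[i:i + width] for i in range(0, len(padded), width)]


def encode(matrix: List[List[int]], width: int = WIDTH) -> List[float]:
    """Collapse the matrix into one vector of the same width (column means)."""
    if not matrix:
        return [0.0] * width
    rows = len(matrix)
    return [sum(row[c] for row in matrix) / rows for c in range(width)]


def pairwise_error_vector(first: str, second: str) -> List[float]:
    """Fixed-width representation of how two transcriptions differ."""
    return encode(to_matrix(per_char_diffs(first, second)))


vector = pairwise_error_vector("abcd", "abxd")
```

Because the vector length does not depend on the length of the transcriptions, vectors from recordings of different durations can be compared or fed to a downstream model with a fixed input size.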
