-
Publication No.: US11094326B2
Publication Date: 2021-08-17
Application No.: US16056298
Filing Date: 2018-08-06
Applicant: Cisco Technology, Inc.
Inventor: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
Abstract: One embodiment of the present invention sets forth a technique for performing ensemble modeling of ASR output. The technique includes generating input to a machine learning model from snippets of voice activity in the recording and transcriptions produced by multiple automatic speech recognition (ASR) engines from the recording. The technique also includes applying the machine learning model to the input to select, based on transcriptions of the snippet produced by at least one contributor ASR engine of the multiple ASR engines and at least one selector ASR engine of the multiple ASR engines, a best transcription of the snippet from possible transcriptions of the snippet produced by the multiple ASR engines. The technique further includes storing the best transcription in association with the snippet.
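The selection step in this abstract can be illustrated with a minimal sketch. It assumes a simple agreement heuristic in place of the machine learning model, which the abstract does not specify: each contributor transcription is scored by its mean word-level similarity to every other engine's output, and the highest-scoring one is stored for the snippet. The engine names, the snippet id, and the functions `word_similarity` and `select_best` are hypothetical.

```python
from difflib import SequenceMatcher


def word_similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1] between two transcriptions."""
    return SequenceMatcher(None, a.split(), b.split(), autojunk=False).ratio()


def select_best(contributors: dict[str, str], selectors: dict[str, str]) -> str:
    """Return the contributor transcription that agrees most with the other engines.

    Assumption: mean pairwise agreement stands in for the trained model
    described in the abstract.
    """
    if not contributors:
        raise ValueError("need at least one contributor transcription")

    def agreement(name: str) -> float:
        # Compare against the other contributors and all selectors.
        others = [t for n, t in contributors.items() if n != name]
        others += list(selectors.values())
        if not others:
            return 0.0
        return sum(word_similarity(contributors[name], t) for t in others) / len(others)

    return contributors[max(contributors, key=agreement)]


# Hypothetical engine names and outputs for a single snippet.
contributors = {
    "contributor_a": "the quick brown fox jumps",
    "contributor_b": "a quick brow fox jump",
}
selectors = {"selector_c": "the quick brown fox jumped"}

# Store the selected transcription in association with a hypothetical snippet id.
best_by_snippet = {"snippet_0": select_best(contributors, selectors)}
```

In this sketch only contributor outputs are eligible as the result, while selector outputs influence the score. A trained model could replace `agreement` without changing the surrounding flow.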
-
Publication No.: US12087276B1
Publication Date: 2024-09-10
Application No.: US17155825
Filing Date: 2021-01-22
Applicant: Cisco Technology, Inc.
Abstract: A plurality of audio datasets associated with captured audio are provided to a plurality of automatic speech recognition engines, wherein each of the automatic speech recognition engines is configured to recognize speech of a first language. Word error rate estimates that comprise at least one word error rate estimate for each of the plurality of audio datasets are determined from outputs of the plurality of automatic speech recognition engines. From the word error rate estimates, audio in the plurality of audio datasets is determined to include speech in a second language.
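One way to picture the idea in this abstract is sketched below. The abstract does not say how the word error rate estimates are computed, so the sketch assumes a proxy: the mean pairwise word-level edit distance between the engines' outputs for the same audio. When engines built for the first language disagree heavily across the datasets, the audio is flagged as probably containing another language. The threshold value and all function names are hypothetical.

```python
from itertools import combinations


def word_edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]


def estimated_wer(outputs: list[str]) -> float:
    """Assumed proxy for WER: mean pairwise disagreement between engine outputs."""
    if len(outputs) < 2:
        raise ValueError("need outputs from at least two engines")
    rates = []
    for a, b in combinations(outputs, 2):
        ref, hyp = a.split(), b.split()
        rates.append(word_edit_distance(ref, hyp) / max(len(ref), 1))
    return sum(rates) / len(rates)


def likely_second_language(datasets: list[list[str]], threshold: float = 0.5) -> bool:
    """Flag the audio when the average estimate across datasets exceeds the threshold."""
    estimates = [estimated_wer(outputs) for outputs in datasets]
    return sum(estimates) / len(estimates) > threshold


# Hypothetical outputs: one inner list per audio dataset, one string per engine.
consistent = [["turn on the lights", "turn on the lights"],
              ["call me tomorrow morning", "call me tomorrow morning"]]
inconsistent = [["bone jure come on tally voo", "bon sure comment ali view"],
                ["mercy bow coo", "mare sea buckle"]]

flag_consistent = likely_second_language(consistent)
flag_inconsistent = likely_second_language(inconsistent)
```

The fixed threshold is only a placeholder. A real system would presumably calibrate it, or use a different estimator altogether.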
-
Publication No.: US11380315B2
Publication Date: 2022-07-05
Application No.: US16297602
Filing Date: 2019-03-09
Applicant: Cisco Technology, Inc.
Inventor: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
Abstract: One embodiment of the present invention sets forth a technique for analyzing a transcription of a recording. The technique includes generating features representing transcriptions produced by multiple automatic speech recognition (ASR) engines from voice activity in the recording and a best transcription of the recording produced by an ensemble model from the transcriptions. The technique also includes applying a machine learning model to the features to produce a score representing an accuracy of the best transcription. The technique further includes storing the score in association with the best transcription.
-
Publication No.: US11024315B2
Publication Date: 2021-06-01
Application No.: US16297603
Filing Date: 2019-03-09
Applicant: Cisco Technology, Inc.
Inventor: Ahmad Abdulkader, Mohamed Gamal Mohamed Mahmoud
Abstract: One embodiment of the present invention sets forth a technique for analyzing transcriptions of a recording. The technique includes storing per-character differences between a first set of characters from a first transcription of the recording and a second set of characters from a second transcription of the recording in a matrix with a fixed width. The technique also includes encoding the per-character differences in the matrix into a vector of the fixed width. The technique further includes outputting the vector as a representation of a pairwise error rate between the first transcription and the second transcription.
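The fixed-width representation described in this abstract can be sketched roughly as follows. The abstract does not give the matrix layout or the encoder, so this sketch makes two assumptions: per-character differences are 0/1 flags from a character alignment, wrapped row by row into a matrix of the chosen width, and the "encoding" is a plain column-wise mean. A learned encoder could be substituted for that last step. All function names and the width of 4 are hypothetical.

```python
from difflib import SequenceMatcher


def char_diff_flags(first: str, second: str) -> list[int]:
    """One flag per aligned character position: 0 for a match, 1 for a difference."""
    flags: list[int] = []
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        span = max(i2 - i1, j2 - j1)
        flags.extend([0 if tag == "equal" else 1] * span)
    return flags


def to_fixed_width_matrix(flags: list[int], width: int) -> list[list[int]]:
    """Wrap the flags into rows of `width`; the last row is zero-padded.

    Note: zero padding is indistinguishable from a match in this simple layout.
    """
    rows = [flags[k:k + width] for k in range(0, len(flags), width)]
    if rows:
        rows[-1] = rows[-1] + [0] * (width - len(rows[-1]))
    return rows


def encode(matrix: list[list[int]], width: int) -> list[float]:
    """Assumed encoder: column-wise mean, giving a vector of length `width`."""
    if not matrix:
        return [0.0] * width
    return [sum(row[c] for row in matrix) / len(matrix) for c in range(width)]


WIDTH = 4  # hypothetical fixed width
flags = char_diff_flags("hello world", "hallo world")
matrix = to_fixed_width_matrix(flags, WIDTH)
vector = encode(matrix, WIDTH)
```

Whatever the input lengths, `vector` always has `WIDTH` entries, which is the property that lets it feed a downstream model expecting a fixed-size input.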