Machine learning-based prediction of transcriber performance on a segment of audio

    Publication No.: US10607611B1

    Publication date: 2020-03-31

    Application No.: US16595279

    Filing date: 2019-10-07

    Abstract: When transcribing large audio files, such as in the case of legal depositions, there are often many transcribers to choose from. Embodiments described herein enable calculation of expected accuracy of transcriptions by transcribers, which can be used to guide the selection of transcribers for specific tasks. In one embodiment, a computer receives a segment of an audio recording that includes speech of a person, and identifies an accent of the person and a topic of the segment. The computer generates feature values based on data that includes the accent and the topic, and utilizes a model to calculate, based on the feature values, an expected accuracy of a transcription of the segment by a certain transcriber. The model is generated based on training data that includes segments of previous audio recordings and values of accuracies of transcriptions, by the certain transcriber, of the segments.
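The abstract above leaves the form of the model open. As a minimal sketch, assuming the simplest possible stand-in (a per-transcriber lookup of mean past accuracy keyed by accent and topic, with fallbacks for unseen combinations), transcriber selection could look like the following. The names `AccuracyModel` and `pick_transcriber` and the sample values are hypothetical and do not come from the patent.

```python
from collections import defaultdict
from statistics import mean


class AccuracyModel:
    """Hypothetical per-transcriber model: mean past accuracy by (accent, topic).

    This is a stand-in for the trained model the abstract mentions,
    not the patented method itself.
    """

    def __init__(self, history):
        # history: list of (accent, topic, accuracy) tuples for ONE transcriber
        by_pair = defaultdict(list)
        by_accent = defaultdict(list)
        for accent, topic, accuracy in history:
            by_pair[(accent, topic)].append(accuracy)
            by_accent[accent].append(accuracy)
        self.pair_mean = {key: mean(vals) for key, vals in by_pair.items()}
        self.accent_mean = {key: mean(vals) for key, vals in by_accent.items()}
        self.global_mean = mean(acc for _, _, acc in history)

    def expected_accuracy(self, accent, topic):
        # Prefer the most specific statistic available, then fall back.
        if (accent, topic) in self.pair_mean:
            return self.pair_mean[(accent, topic)]
        if accent in self.accent_mean:
            return self.accent_mean[accent]
        return self.global_mean


def pick_transcriber(models, accent, topic):
    """Return the name of the transcriber with the highest expected accuracy."""
    return max(models, key=lambda name: models[name].expected_accuracy(accent, topic))


models = {
    "alice": AccuracyModel([("scottish", "medical", 0.9), ("indian", "legal", 0.8)]),
    "bob": AccuracyModel([("scottish", "medical", 0.7), ("indian", "legal", 0.95)]),
}
best = pick_transcriber(models, "indian", "legal")
```

A real system would likely replace the lookup with a regression model over richer feature values; the selection step stays the same.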

    Human resolution of repeated phrases in a hybrid transcription system

    Publication No.: US20210074272A1

    Publication date: 2021-03-11

    Application No.: US16595211

    Filing date: 2019-10-07

    Abstract: When transcribing audio recordings, such as legal depositions, phrases may be repeated throughout the recordings, but these repeated phrases get transcribed incorrectly by an automatic speech recognition (ASR) system. In order to assist a transcriber to correctly resolve such phrases, some embodiments described herein involve a computer that receives an audio recording that includes speech, generates a transcription of the audio recording utilizing an ASR system, and clusters segments of the audio recording into clusters of similar utterances. The computer provides a transcriber with certain segments of the audio recording, which include similar utterances belonging to a certain cluster, along with transcriptions of the certain segments. The computer receives from the transcriber: an indication of which of the certain segments include repetitions of a phrase, and a correct transcription of the phrase. The computer then updates the transcription of the audio recording based on the indication and the correct transcription.
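The cluster-then-correct flow in the abstract above can be sketched as follows. This is a simplified illustration under two stated assumptions: similarity is measured on the ASR text with `difflib.SequenceMatcher` (the patent may well use acoustic features instead), and each segment is treated as containing only the repeated phrase. The function names and the 0.6 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def cluster_segments(asr_texts, threshold=0.6):
    """Greedy clustering of segments by text similarity.

    Each segment joins the first cluster whose representative (its first
    member) is similar enough; otherwise it starts a new cluster.
    """
    clusters = []  # list of lists of segment indices
    for i, text in enumerate(asr_texts):
        for cluster in clusters:
            representative = asr_texts[cluster[0]]
            ratio = SequenceMatcher(None, text.lower(), representative.lower()).ratio()
            if ratio >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def apply_correction(asr_texts, confirmed_indices, correct_phrase):
    """Replace the segments the transcriber confirmed with the correct phrase."""
    updated = list(asr_texts)
    for i in confirmed_indices:
        updated[i] = correct_phrase
    return updated


texts = ["doctor shevchenko", "good morning", "doctor shiv chenko"]
clusters = cluster_segments(texts)
# The transcriber reviews one cluster, confirms its members, and supplies the phrase.
fixed = apply_correction(texts, clusters[0], "Dr. Shevchenko")
```

The point of clustering first is that the transcriber resolves the phrase once per cluster rather than once per occurrence.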

    Rapid frontend resolution of transcription-related inquiries by backend transcribers

    Publication No.: US10665241B1

    Publication date: 2020-05-26

    Application No.: US16595032

    Filing date: 2019-10-07

    Abstract: Being able to rapidly and accurately transcribe long audio recordings, such as same-day transcription of multi-hour legal depositions, is a challenging task. Hybrid transcription, which involves automatic speech recognition (ASR) systems generating initial transcriptions that are then reviewed by human transcribers, can be used to tackle this challenge. However, hybrid transcription may be stymied when the transcribers cannot resolve certain issues in the ASR-generated transcriptions. This disclosure describes rapid resolution of transcription-related inquiries of transcribers. In one embodiment, a computer receives an audio recording that includes speech of multiple people in a room and generates transcriptions of segments of the audio recording utilizing an ASR system. These transcriptions are provided for review of transcribers. The computer receives questions from the transcribers regarding the transcriptions, and transmits the questions to a server in the room, which transmits back answers to the questions by the people in the room.

    User interface to assist in hybrid transcription of audio that includes a repeated phrase

    Publication No.: US20210074294A1

    Publication date: 2021-03-11

    Application No.: US16595264

    Filing date: 2019-10-07

    Abstract: When transcribing an audio recording, certain phrases may be difficult to resolve, especially if they involve names and/or infrequently used terms. However, often such phrases may be repeated multiple times throughout the audio recording. Embodiments described herein interact with a transcriber to resolve such cases of repeated phrases. In one embodiment, a computer plays segments of an audio recording to the transcriber, and at least some of the segments include an utterance of a phrase. The computer also presents, to the transcriber, transcriptions of the segments, and at least some of the transcriptions do not include a correct transcription of the phrase. The computer receives from the transcriber an indication of which of the segments include an utterance of the phrase and the correct transcription of the phrase, and then updates a transcription of the audio recording accordingly.

    Human-curated glossary for rapid hybrid-based transcription of audio

    Publication No.: US10607599B1

    Publication date: 2020-03-31

    Application No.: US16594809

    Filing date: 2019-10-07

    Abstract: Described herein are curation of a glossary and its utilization for automatic speech recognition (ASR). In one embodiment, a server receives an audio recording of speech, taken over a period spanning at least two hours. During the first hour, the server generates, utilizing an ASR system, a transcription of a segment of the audio, recorded during the first twenty minutes. The server receives, from a transcriber, a phrase that does not appear in the transcription, but was spoken in the segment, and adds the phrase to a glossary. After the first hour of the period, the server generates, utilizing the ASR system, a second transcription of a second segment of the audio, provides the second transcription and the glossary to a second transcriber, and receives a corrected transcription, in which the second transcriber substituted a second phrase in the second transcription, which was not in the glossary, with the phrase.
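The glossary workflow in the abstract above can be illustrated with a small sketch: one transcriber adds a phrase that the ASR system missed, and a later transcriber uses the glossary to replace a misrecognized phrase. The `Glossary` class, the fuzzy lookup via `difflib.get_close_matches`, and the 0.8 cutoff are assumptions made for illustration; the patent does not specify how glossary entries are matched or presented.

```python
from difflib import get_close_matches


class Glossary:
    """Hypothetical shared glossary of phrases supplied by transcribers."""

    def __init__(self):
        self._phrases = {}  # lowercase key -> phrase as entered

    def add(self, phrase):
        self._phrases[phrase.lower()] = phrase

    def suggest(self, heard, cutoff=0.8):
        """Return the closest glossary phrase to `heard`, or None."""
        keys = get_close_matches(heard.lower(), list(self._phrases), n=1, cutoff=cutoff)
        return self._phrases[keys[0]] if keys else None


def correct_transcription(text, wrong_phrase, glossary):
    """Substitute a misrecognized phrase with its glossary suggestion, if any."""
    suggestion = glossary.suggest(wrong_phrase)
    return text.replace(wrong_phrase, suggestion) if suggestion else text


glossary = Glossary()
glossary.add("Shevchenko")  # added by the first transcriber early in the recording
corrected = correct_transcription("I met shev chenko today", "shev chenko", glossary)
```

In the embodiment described, the second transcriber makes the substitution by hand; the fuzzy lookup here only shows how a glossary could surface a candidate.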
