Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Katrin Kirchoff"

1.

Granted invention patent
Streaming real-time automatic speech recognition service (status: active)

Publication No.: US10777186B1

Publication date: 2020-09-15

Application No.: US16190047

Filing date: 2018-11-13

Applicant: Amazon Technologies, Inc.

Inventor: Stefano Stefani, Pramod Gurunath, Ashish Singh, Katrin Kirchoff, Deepikaa Suresh, Varun Sembium Varadarajan, Vasanth Philomin, Vikram Sathyanarayana Anbazhagan, Pu Paul Zhao, Vijit Gupta, Ruoyu Huang

IPC: G10L15/00, G10L15/06, G10L25/78, G10L15/30, G10L15/04, G10L15/26, G10L15/183

Abstract: Techniques for streaming real-time automated speech recognition (ASR) are described. A user can stream audio data to a frontend service of the ASR service. The frontend service can establish a bi-directional connection to an audio decoder host to perform ASR on the data stream. The audio decoder host may include a streaming ASR engine which can analyze chunks of the audio data stream using an acoustic model to divide the audio data into words, and a language model to identify sentences made of the words spoken in the audio file. The acoustic model can be trained using short audio sentence data (e.g., on the order of 30 seconds to a few minutes), enabling the transcription service to accurately transcribe short chunks of audio data. The results are then punctuated and normalized. The resulting transcript is then streamed back to the user over the bi-directional connection.
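The flow in this abstract (chunk the incoming audio, decode each chunk into words, punctuate and normalize, stream the result back) can be illustrated with a minimal Python sketch. This is not the patented implementation: every name here (chunk_stream, punctuate_and_normalize, stream_transcripts, fake_decode) is hypothetical, and the decoder is a trivial stand-in for the acoustic and language models.

```python
from typing import Callable, Iterable, Iterator, List


def chunk_stream(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive fixed-size chunks of an audio byte stream."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def punctuate_and_normalize(words: List[str]) -> str:
    """Toy post-processing: join the words, capitalize, and add a period."""
    if not words:
        return ""
    sentence = " ".join(words)
    return sentence[0].upper() + sentence[1:] + "."


def stream_transcripts(
    chunks: Iterable[bytes],
    decode: Callable[[bytes], List[str]],
) -> Iterator[str]:
    """Decode each chunk as it arrives and yield a partial transcript.

    `decode` is an assumed callable standing in for the streaming ASR
    engine (acoustic model plus language model).
    """
    for chunk in chunks:
        text = punctuate_and_normalize(decode(chunk))
        if text:
            yield text


def fake_decode(chunk: bytes) -> List[str]:
    """Stand-in decoder: treats the chunk as ASCII text and splits on spaces."""
    return chunk.decode("ascii").split()
```

Because stream_transcripts is a generator, each partial transcript becomes available as soon as its chunk is decoded, which mirrors streaming results back over an open connection rather than waiting for the whole recording.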

2.

Invention patent application
GUIDING TRANSCRIPT GENERATION USING DETECTED SECTION TYPES AS PART OF AUTOMATIC SPEECH RECOGNITION (status: active)

Publication No.: US20250029612A1

Publication date: 2025-01-23

Application No.: US18356117

Filing date: 2023-07-20

Applicant: Amazon Technologies, Inc.

Inventor: Lei Xu, Aparna Elangovan, Rohit Paturi, Sundararajan Srinivasan, Sravan BAbu Bodapati, Katrin Kirchoff, Sarthak Handa

IPC: G10L15/26, G06F40/20

Abstract: Transcript generation as part of automatic speech recognition may be guided using section types. Audio data is received for transcription. An initial transcript of the audio data may be generated and evaluated to determine a section type for the audio data. The section type may then be used to focus generation of a second version of the transcript on one speaker over another speaker.
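The two-pass idea in this abstract (make an initial transcript, infer a section type from it, then use that type to favor one speaker in a second version) can be sketched roughly as follows. Everything below is an illustrative assumption: the section-type names, the question-mark heuristic, the FOCUS_BY_SECTION mapping, and the speaker labels do not come from the patent, and the second pass here only filters the first transcript instead of re-running recognition on the audio.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, str]  # (speaker label, transcribed text)

# Hypothetical mapping from a detected section type to the speaker to focus on.
FOCUS_BY_SECTION: Dict[str, str] = {
    "question_and_answer": "guest",
    "presentation": "host",
}


def detect_section_type(initial_transcript: List[Segment]) -> str:
    """Toy classifier: any question in the transcript implies a Q&A section."""
    has_question = any(text.strip().endswith("?") for _, text in initial_transcript)
    return "question_and_answer" if has_question else "presentation"


def second_pass(initial_transcript: List[Segment]) -> Tuple[str, List[str]]:
    """Return the detected section type and the text of the focused speaker."""
    section = detect_section_type(initial_transcript)
    focus = FOCUS_BY_SECTION[section]
    focused_text = [text for speaker, text in initial_transcript if speaker == focus]
    return section, focused_text
```

For example, an initial transcript in which the host asks a question and the guest answers would be classified as a question-and-answer section, and the second version would keep the guest's answer.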
