-
Publication No.: US10777186B1
Publication Date: 2020-09-15
Application No.: US16190047
Application Date: 2018-11-13
Applicant: Amazon Technologies, Inc.
Inventor: Stefano Stefani , Pramod Gurunath , Ashish Singh , Katrin Kirchhoff , Deepikaa Suresh , Varun Sembium Varadarajan , Vasanth Philomin , Vikram Sathyanarayana Anbazhagan , Pu Paul Zhao , Vijit Gupta , Ruoyu Huang
Abstract: Techniques for streaming real-time automated speech recognition (ASR) are described. A user can stream audio data to a frontend service of the ASR service. The frontend service can establish a bi-directional connection to an audio decoder host to perform ASR on the data stream. The audio decoder host may include a streaming ASR engine which can analyze chunks of the audio data stream using an acoustic model to divide the audio data into words, and a language model to identify sentences made of the words spoken in the audio file. The acoustic model can be trained using short audio sentence data (e.g., on the order of 30 seconds to a few minutes), enabling the transcription service to accurately transcribe short chunks of audio data. The results are then punctuated and normalized. The resulting transcript is then streamed back to the user over the bi-directional connection.
-
Publication No.: US20250111091A1
Publication Date: 2025-04-03
Application No.: US18478766
Application Date: 2023-09-29
Applicant: Amazon Technologies, Inc.
Inventor: Karthik Saligrama Shreeram , Varun Sembium Varadarajan , Sanjukta Ghosh , Nidish Rajendran Nair , Surya Ram , Ashwin Shukla , Sachin Bangalore Raj , Ishaan Berry , Ji Hoon Kim , Kartik Mittal , Pankhuri Gupta , Tiejun Zhao
IPC: G06F21/62
Abstract: Intent classification is performed for executing a retrieval augmented generation pipeline for natural language tasks using a generative machine learning model. A natural language generative application with associated data repositories may submit a natural language task. A classification machine learning model is used to determine an intent for the natural language request. A number of iterations of a retrieval pipeline may be determined to perform the natural language task based on the intent. The natural language request may be processed through a retrieval pipeline according to the determined number of iterations before returning a result to the request.
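The intent-driven iteration count described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the patent: the intent labels, the `ITERATIONS_BY_INTENT` table, the keyword heuristic that stands in for the classification model, and the way the query is refined between iterations. The final generation step is omitted.

```python
from typing import Callable, Dict, List

# Hypothetical mapping from intent label to number of retrieval iterations.
ITERATIONS_BY_INTENT: Dict[str, int] = {"lookup": 1, "comparison": 2, "multi_hop": 3}


def classify_intent(request: str) -> str:
    """Stand-in for the classification model: a simple keyword heuristic."""
    text = request.lower()
    if "compare" in text or " versus " in text:
        return "comparison"
    if " and then " in text:
        return "multi_hop"
    return "lookup"


def run_pipeline(request: str, retrieve: Callable[[str], List[str]]) -> List[str]:
    """Run the retrieval step as many times as the classified intent calls for."""
    iterations = ITERATIONS_BY_INTENT[classify_intent(request)]
    context: List[str] = []
    query = request
    for _ in range(iterations):
        passages = retrieve(query)
        context.extend(passages)
        # Assumed refinement: fold the latest passages into the next query.
        query = request + " " + " ".join(passages)
    return context
```

A caller would pass its own retriever as `retrieve`; the collected `context` would then be placed in the prompt for the generative model.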
-
Publication No.: US11551695B1
Publication Date: 2023-01-10
Application No.: US15931455
Application Date: 2020-05-13
Applicant: Amazon Technologies, Inc.
Inventor: Vivek Govindan , Varun Sembium Varadarajan , Christian Egon Berkhoff Dossow , Himalay Mohanlal Joriwal , Sai Madhuri Bhavirisetty , Abhinav Kumar , Orestis Lykouropoulos , Akshay Nalwaya , Rahul Gupta , Sravan Babu Bodapati , Liangwei Guo , Julian E. S. Salazar , Yibin Wang , K P N V D S Siva Rama , Calvin Xuan Li , Mohit Narendra Gupta , Asem Rustum , Katrin Kirchhoff , Pu Zhao
Abstract: A transcription service may receive a request from a developer to build a custom speech-to-text model for a specific domain of speech. The custom speech-to-text model for the specific domain may replace a general speech-to-text model or add to a set of one or more speech-to-text models available for transcribing speech. The transcription service may receive a training data set and instructions representing tasks. The transcription service may determine respective schedules for executing the instructions based at least in part on dependencies between the tasks. The transcription service may execute the instructions according to the respective schedules to train a speech-to-text model for a specific domain using the training data set. The transcription service may deploy the trained speech-to-text model as part of a network-accessible service for an end user to convert audio in the specific domain into text.
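Scheduling instructions according to dependencies between tasks, as this abstract describes, is commonly done with a topological sort. The sketch below is one possible approach, not the patented method: it uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+) to group tasks into waves that could run in parallel. The task names in the usage note are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def schedule_tasks(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; each task's prerequisites sit in earlier waves.

    `dependencies` maps a task name to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For example, with `{"train_acoustic": {"prepare_data"}, "train_language": {"prepare_data"}, "evaluate": {"train_acoustic", "train_language"}, "deploy": {"evaluate"}}` the function returns four waves: data preparation, the two training tasks together, evaluation, then deployment.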
-
Publication No.: US20250110979A1
Publication Date: 2025-04-03
Application No.: US18478647
Application Date: 2023-09-29
Applicant: Amazon Technologies, Inc.
Inventor: Karthik Saligrama Shreeram , Varun Sembium Varadarajan , Sanjukta Ghosh , Nidish Rajendran Nair , Sachin Bangalore Raj , En Lin , Jeff Gregory Registre , Jaydeep Ramani , Inan Tainwala , Kartik Mittal , Pankhuri Gupta , Tiejun Zhao
IPC: G06F16/33 , G06F16/332
Abstract: Distributed orchestration of data retrieval for a generative machine learning model may be performed. When a natural language request to perform a natural language task is received that is associated with a generative application, one or more data retrievers may be selected to access associated data repositories according to a previously specified retrieval configuration for the generative natural language application. The data may then be obtained by the selected data retrievers and used to generate a prompt to a generative machine learning model. A result of the generative machine learning model may then be used to provide a response to the natural language request to perform the natural language task.
-
Publication No.: US20240331821A1
Publication Date: 2024-10-03
Application No.: US18194350
Application Date: 2023-03-31
Applicant: Amazon Technologies, Inc.
Inventor: Vijit Gupta , Matthew Chih-Hui Chiou , Amiya Kishor Chakraborty , Anuroop Arora , Varun Sembium Varadarajan , Sarthak Handa , Amit Vithal Sawant , Glen Herschel Carpenter , Jesse Deng , Mohit Narendra Gupta , Rohil Bhattarai , Samuel Benjamin Schiff , Shane Michael McGookey , Tianze Zhang
Abstract: Systems and methods for performing medical audio summarization for medical conversations are disclosed. An audio file and metadata for a medical conversation are provided to a medical audio summarization system. A transcription machine learning model is used by the medical audio summarization system to generate a transcript and a natural language processing service of the medical audio summarization system is used to generate a summary of the transcript. The natural language processing service may include at least four machine learning models that identify medical entities in the transcript, identify speaker roles in the transcript, determine sections of the transcript corresponding to the summary, and extract or abstract phrases for the summary. The identified medical entities and speaker roles, determined sections, and extracted or abstracted phrases may then be used to generate the summary.
-
Publication No.: US12223259B1
Publication Date: 2025-02-11
Application No.: US16587800
Application Date: 2019-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Varun Sembium Varadarajan , Sravan Babu Bodapati , Deepthi Devaiah Devanira , Pu Paul Zhao , Katrin Kirchhoff , Yue Yang
IPC: G06F40/166 , G06F18/214 , G06F21/62 , G06F40/279 , G06F40/30 , G06N20/00
Abstract: Techniques for managing access to sensitive data in transcriptions are described. A method for managing access to sensitive data in transcriptions may include receiving a request to generate a redacted transcript of content, obtaining a transcript of the content, sending at least a portion of the transcript to a model endpoint to identify sensitive entities in the transcript, receiving an inference response identifying one or more sensitive entities in the transcript, and generating the redacted transcript based at least on the transcript and the inference response.
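The last step of this abstract, generating a redacted transcript from the transcript and the inference response, might look like the sketch below. The response format is an assumption made for illustration: a list of non-overlapping entities with `start` and `end` character offsets (end exclusive). The `[REDACTED]` mask is likewise hypothetical.

```python
from typing import Dict, List


def redact(transcript: str, entities: List[Dict[str, int]], mask: str = "[REDACTED]") -> str:
    """Replace each [start, end) character span in the transcript with the mask.

    Assumes the spans do not overlap.
    """
    result = transcript
    # Apply replacements right to left so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        result = result[: entity["start"]] + mask + result[entity["end"]:]
    return result
```

Processing spans from the end of the string backwards avoids recomputing offsets after each replacement changes the string length.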
-
Publication No.: US11487942B1
Publication Date: 2022-11-01
Application No.: US16437338
Application Date: 2019-06-11
Applicant: Amazon Technologies, Inc.
Inventor: Thiruvarul Selvan Senthivel , Varun Sembium Varadarajan , Borui Zhang , Tiberiu Mircea Doman , Parminder Bhatia , Arun Kumar Ravi , Mohammed Khalilia , Emine Busra Celikkaya
IPC: G06F16/93 , G06F40/30 , G06F40/295 , G06F16/28 , G06F16/31 , G06N3/04 , G06N3/08 , G06F40/284
Abstract: Techniques for entity and relationship detection from unstructured text as a service are described. A service may receive a request to identify entities within a provided unstructured text element, and the service may segment and tokenize the unstructured text and send the result to multiple services implementing multiple deep machine learning models trained to identify particular entities. The service may send additional requests to an additional service or services implementing additional deep machine learning models to identify relationships between detected attributes and ones of the detected entities. The outputs from all services can be analyzed and consolidated into a single result that identifies the entities, any attributes of the entities, and confidence scores indicating the confidence in each detected entity.