Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Michael Mark Goodwin"

1.

Invention publication
Multi-Talker Audio Stream Separation, Transcription and Diaraization (Status: under examination, published)

Publication number: US20240096346A1

Publication date: 2024-03-21

Application number: US17850617

Application date: 2022-06-27

Applicant: Amazon Technologies, Inc.

Inventor: Masahito Togami, Ritwik Giri, Michael Mark Goodwin, Arvindh Krishnaswamy, Siddhartha Shankara Rao

IPC: G10L21/10, G10L15/04, G10L21/0208

CPC classification number: G10L21/10, G10L15/04, G10L21/0208

Abstract: A plurality of talker embedding vectors may be derived that correspond to a plurality of talkers in an input audio stream. Each talker embedding vector may represent respective voice characteristics of a respective talker. The talker embedding vectors may be generated based on, for example, a pre-enrollment process or a cluster-based embedding vector derivation process. A plurality of instances of a personalized noise suppression model may be executed on the input audio stream. Each instance of the personalized noise suppression model may employ a respective talker embedding vector. A plurality of single-talker audio streams may be generated by the plurality of instances of the personalized noise suppression model. A plurality of single-talker transcriptions may be generated based on the plurality of single-talker audio streams. The plurality of single-talker transcriptions may be merged into a multi-talker output transcription.
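The final step of this abstract, merging single-talker transcriptions into one multi-talker transcription, can be illustrated with a minimal sketch. It assumes each single-talker transcription is already available as timestamped segments; the `Segment` type, the `merge_transcriptions` function, the talker labels, and the output line format are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    """One transcribed utterance from a single-talker audio stream (hypothetical type)."""
    start: float  # seconds from the start of the input stream
    end: float
    text: str


def merge_transcriptions(per_talker: Dict[str, List[Segment]]) -> List[str]:
    """Interleave per-talker segments by start time and label each line with its talker."""
    labeled = [
        (seg.start, talker, seg.text)
        for talker, segments in per_talker.items()
        for seg in segments
    ]
    # Sort by start time; ties are broken by talker label so the output is deterministic.
    labeled.sort(key=lambda item: (item[0], item[1]))
    return [f"[{start:.1f}s] {talker}: {text}" for start, talker, text in labeled]


# Illustrative input: one list of segments per separated talker stream.
single_talker = {
    "talker_1": [Segment(0.0, 1.5, "Shall we start?"), Segment(4.0, 5.0, "Great.")],
    "talker_2": [Segment(1.8, 3.6, "Yes, I have the agenda.")],
}
merged = merge_transcriptions(single_talker)
```

Ordering by start time is only one possible merge policy; overlapping speech would need a rule of its own, which the abstract does not specify.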

2.

Invention grant
Videoconference content sharing for public switched telephone network participants (Status: in force)

Publication number: US11909787B1

Publication date: 2024-02-20

Application number: US17710294

Application date: 2022-03-31

Applicant: Amazon Technologies, Inc.

Inventor: John Joseph Dunne, Siddhartha Shankara Rao, Michael Mark Goodwin

IPC: H04L65/403, H04L65/1089

CPC classification number: H04L65/403, H04L65/1089

Abstract: A videoconference among a plurality of participants may be hosted, wherein the plurality of participants comprise Internet Protocol (IP)-connected participants and a Public Switched Telephone Network (PSTN)-connected participant. The IP-connected participants may send and receive audio content and video content of the videoconference via IP-based connections. The PSTN-connected participant may send and receive the audio content of the videoconference via a PSTN connection. Additional content from the videoconference may also be transmitted to the PSTN-connected participant, for example as text messages via the PSTN connection. The additional content may include, for example, images of a videoconference screen share, chat posts, polls, and the like. Images may be transmitted in the additional content based on video status change events, such as switching slides or pages in a screen share. In some examples, bidirectional messaging may allow contents of text messages from the PSTN-connected user to be displayed in the videoconference.
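The idea of sending images only on video status change events, such as a slide switch, can be sketched as follows. This is an assumption-laden illustration: it treats a change in the SHA-256 digest of a captured frame as the "status change", which is a simple stand-in and not the detection method described in the patent. The `changed_frames` function is hypothetical.

```python
import hashlib
from typing import Iterable, Iterator, Optional, Tuple


def changed_frames(frames: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Yield (index, frame) only when a frame differs from the previous one.

    Exact-digest comparison is an assumed stand-in for a real
    slide-change or page-change detector.
    """
    last_digest: Optional[bytes] = None
    for index, frame in enumerate(frames):
        digest = hashlib.sha256(frame).digest()
        if digest != last_digest:
            last_digest = digest
            yield index, frame


# Illustrative capture: slide A shown twice, then slide B twice, then back to A.
captured = [b"slide-A", b"slide-A", b"slide-B", b"slide-B", b"slide-A"]
to_send = [index for index, _ in changed_frames(captured)]
```

Only the frames at the yielded indices would be forwarded to the PSTN-connected participant, which keeps the number of messages proportional to the number of slide changes rather than to the frame rate.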

3.

Invention application
UNIFIED AUDIO SUPPRESSION MODEL (Status: in force)

Publication number: US20250111857A1

Publication date: 2025-04-03

Application number: US18478759

Application date: 2023-09-29

Applicant: Amazon Technologies, Inc.

Inventor: Ritwik Giri, Zhepei Wang, Devansh Shah, Jean-Marc Valin, Michael Mark Goodwin

IPC: G10L21/0208, G10L25/30, H04M3/56

Abstract: Examples herein provide an approach to enhance an audio mixture of a teleconference application by switching between noise suppression modes using a single model. Specifically, a machine learning (ML) model may be configured to, in response to receiving an audio mixture representation as input, either suppress a background noise of the audio mixture or suppress all noise of the audio mixture except a user's voice. In some examples, the ML model may be trained on speech and background noise training data during a training phase. In addition, the ML model may be trained on a user's voice during an enrollment phase. During an inference phase, the ML model may enhance the audio mixture by suppressing a portion of the audio mixture.
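One way a single model could be switched between the two suppression modes is by conditioning its input, sketched below. The abstract does not say how the mode is selected, so everything here is an assumption made for illustration: the `SuppressionMode` enum, the `build_model_input` function, the mode flag, and the use of an all-zero vector when no enrolled voice embedding applies.

```python
from enum import Enum
from typing import List, Optional, Sequence


class SuppressionMode(Enum):
    BACKGROUND_NOISE = 0  # suppress background noise only
    ALL_BUT_USER = 1      # suppress everything except the enrolled user's voice


def build_model_input(
    frame: Sequence[float],
    mode: SuppressionMode,
    user_embedding: Optional[Sequence[float]],
    embedding_size: int,
) -> List[float]:
    """Concatenate frame features, a mode flag, and a conditioning vector.

    Assumed layout: [frame features..., mode flag, embedding...].
    """
    if mode is SuppressionMode.ALL_BUT_USER:
        if user_embedding is None or len(user_embedding) != embedding_size:
            raise ValueError("ALL_BUT_USER mode needs an enrolled voice embedding")
        conditioning = list(user_embedding)
    else:
        # No enrolled voice is needed here; pad with zeros to keep the input shape fixed.
        conditioning = [0.0] * embedding_size
    return list(frame) + [float(mode.value)] + conditioning


generic = build_model_input([0.1, 0.2], SuppressionMode.BACKGROUND_NOISE, None, 3)
personal = build_model_input([0.1, 0.2], SuppressionMode.ALL_BUT_USER, [1.0, 2.0, 3.0], 3)
```

Keeping the input shape identical in both modes is what would let one set of weights serve both behaviours in this sketch.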

4.

Invention grant
Separate representations of videoconference participants that use a shared device (Status: in force)

Publication number: US12010459B1

Publication date: 2024-06-11

Application number: US17710731

Application date: 2022-03-31

Applicant: Amazon Technologies, Inc.

Inventor: John Joseph Dunne, Michael Klingbeil, Michael Mark Goodwin, Siddhartha Shankara Rao

IPC: H04N7/15, G06V40/16, G10L17/06

CPC classification number: H04N7/15, G06V40/172, G10L17/06

Abstract: A plurality of device-sharing participants may be detected that are participating in a videoconference via a shared computing device. The detecting of the plurality of device-sharing participants may be performed based, at least in part, on at least one of an audio analysis of captured audio from one or more microphones or a video analysis of captured video from one or more cameras. A plurality of participant connections corresponding to the plurality of device-sharing participants may be joined to the videoconference. Each of the plurality of participant connections may be identified within the videoconference using a respective name. A plurality of video streams and a plurality of audio streams corresponding to the plurality of participant connections may be transmitted, and the plurality of video streams and the plurality of audio streams may be presented to at least one other conference participant.

5.

Invention grant
Real-time low-complexity stereo speech enhancement with spatial cue preservation (Status: in force)

Publication number: US12167223B2

Publication date: 2024-12-10

Application number: US17810303

Application date: 2022-06-30

Applicant: Amazon Technologies, Inc.

Inventor: Masahito Togami, Karim Helwani, Jean-Marc Valin, Michael Mark Goodwin

IPC: H04S7/00, G10L21/0216, H04S1/00

Abstract: Real-time low-complexity stereo speech enhancement with spatial cue preservation may be performed. A stereo speech enhancement system receives a stereo input signal (e.g., a left and right input signal). The stereo speech enhancement system estimates spatial cues for a target speaker and downmixes the stereo input signal into a monaural signal. A low-complexity model may then process the monaural signal to generate an enhanced monaural signal. The stereo speech enhancement system upmixes the enhanced monaural signal based on the estimated spatial cues for the target speaker, to generate an enhanced stereo output signal.
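The downmix, enhance, upmix flow in this abstract can be sketched in a few lines. This is a deliberately simplified illustration, not the patented method: it assumes the only spatial cue is a per-channel level (RMS) ratio estimated from the whole input rather than for a target speaker, it works in the time domain, and `enhance_mono` is a placeholder for the low-complexity model. The function names are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rms(x: Sequence[float]) -> float:
    """Root-mean-square level of a signal (0.0 for an empty signal)."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0


def enhance_stereo(
    left: Sequence[float],
    right: Sequence[float],
    enhance_mono: Callable[[List[float]], List[float]],
) -> Tuple[List[float], List[float]]:
    """Downmix to mono, enhance, then upmix using per-channel level cues."""
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    ref = rms(mono)
    if ref == 0.0:
        # Silent downmix: there is nothing to enhance and no cue to estimate.
        return [0.0] * len(mono), [0.0] * len(mono)
    # Assumed spatial cue: each channel's level relative to the downmix.
    gain_left = rms(left) / ref
    gain_right = rms(right) / ref
    enhanced = enhance_mono(mono)
    return [gain_left * s for s in enhanced], [gain_right * s for s in enhanced]


# Illustrative input: the source is four times louder in the left channel.
left_in = [0.8, -0.8, 0.8, -0.8]
right_in = [0.2, -0.2, 0.2, -0.2]
left_out, right_out = enhance_stereo(left_in, right_in, lambda m: [0.5 * v for v in m])
```

Because both output channels are scaled copies of the same enhanced mono signal, the left-to-right level ratio of the input is carried through whatever the mono model does, which is the sense in which the cue is preserved in this sketch.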

6.

Invention grant
Joint noise and echo suppression for two-way audio communication enhancement (Status: in force)

Publication number: US11924367B1

Publication date: 2024-03-05

Application number: US17668297

Application date: 2022-02-09

Applicant: Amazon Technologies, Inc.

Inventor: Jean-Marc Valin, Karim Helwani, Srikanth Venkata Tenneti, Erfan Soltanmohammadi, Mehmet Umut Isik, Richard Newman, Michael Mark Goodwin, Arvindh Krishnaswamy

IPC: H04M3/00, G10L21/0232, G10L21/034, G10L25/18, H04S3/00, G10L21/0208

CPC classification number: H04M3/002, G10L21/0232, G10L21/034, G10L25/18, H04S3/008, G10L2021/02082, H04S2400/01, H04S2400/03

Abstract: Joint noise and echo suppression may be performed for enhancing two-way audio communications. Audio data captured at a communication device and audio data transmitted to the communication device from another communication device are used as input features to a trained machine learning model. The model uses the transmitted audio data as a reference signal to eliminate residual echo in the captured audio data while also suppressing noise in the captured audio data.

7.

Invention publication
REAL-TIME LOW-COMPLEXITY STEREO SPEECH ENHANCEMENT WITH SPATIAL CUE PRESERVATION (Status: under examination, published)

Publication number: US20240007817A1

Publication date: 2024-01-04

Application number: US17810303

Application date: 2022-06-30

Applicant: Amazon Technologies, Inc.

Inventor: Masahito Togami, Karim Helwani, Jean-Marc Valin, Michael Mark Goodwin

IPC: H04S7/00, H04S1/00, G10L21/0216

CPC classification number: H04S7/303, H04S1/007, G10L21/0216, H04S2400/03, H04S2400/11, H04S2400/15

Abstract: Real-time low-complexity stereo speech enhancement with spatial cue preservation may be performed. A stereo speech enhancement system receives a stereo input signal (e.g., a left and right input signal). The stereo speech enhancement system estimates spatial cues for a target speaker and downmixes the stereo input signal into a monaural signal. A low-complexity model may then process the monaural signal to generate an enhanced monaural signal. The stereo speech enhancement system upmixes the enhanced monaural signal based on the estimated spatial cues for the target speaker, to generate an enhanced stereo output signal.
