-
Publication No.: US12014748B1
Publication Date: 2024-06-18
Application No.: US16988423
Application Date: 2020-08-07
Applicant: Amazon Technologies, Inc.
Inventor: Ritwik Giri, Mehmet Umut Isik, Neerad Dilip Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy
IPC: G10L21/0208, G06N5/04, G06N20/00, G10L21/034
CPC classification number: G10L21/0208, G06N5/04, G06N20/00, G10L21/034, G10L2021/02082
Abstract: Techniques for training and using a machine learning model for estimation of reverberation in a multi-task learning framework are described. According to some embodiments, the multi-task learning framework improves the performance of the machine learning model by estimating the amount of reverberation present in an input audio recording as a secondary task to the primary task of generating a clean speech portion of the input audio recording. In one embodiment, a model architecture is selected that takes a noisy reverberant recording as an input and outputs an estimate of a clean (e.g., de-reverberated) signal, an estimate of noise (e.g., background noise), and an estimate of the reverb only portion, with the secondary task of estimating the reverb only portion acting as a regularizer that improves the machine learning model's performance in enhancing the reverberant (e.g., and noisy) input speech.
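The multi-task idea in this abstract can be illustrated with a weighted training loss. This is only a minimal sketch: the abstract does not state the loss, so the mean-squared-error terms, the weighted-sum form, and the names `mse`, `multitask_loss`, `noise_weight`, and `reverb_weight` are all assumptions made for illustration.

```python
def mse(estimate, target):
    """Mean squared error between two equal-length sample sequences."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def multitask_loss(estimates, targets, noise_weight=0.5, reverb_weight=0.5):
    """Combine the primary clean-speech loss with two auxiliary losses.

    `estimates` and `targets` are dicts with the keys "clean", "noise",
    and "reverb". The weighted-sum form is an assumption, not the
    patent's stated formula.
    """
    primary = mse(estimates["clean"], targets["clean"])
    noise = mse(estimates["noise"], targets["noise"])
    # Secondary task: the reverb-only estimate acts as a regularizer.
    reverb = mse(estimates["reverb"], targets["reverb"])
    return primary + noise_weight * noise + reverb_weight * reverb


estimates = {"clean": [1.0, 0.0], "noise": [0.0, 0.0], "reverb": [2.0, 0.0]}
targets = {"clean": [0.0, 0.0], "noise": [0.0, 0.0], "reverb": [0.0, 0.0]}
loss = multitask_loss(estimates, targets)
```

Setting `reverb_weight=0.0` in this sketch removes the secondary task and leaves only the clean-speech and noise terms.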
-
Publication No.: US12205039B1
Publication Date: 2025-01-21
Application No.: US17087181
Application Date: 2020-11-02
Applicant: Amazon Technologies, Inc.
Inventor: Ritwik Giri, Srikanth Venkata Tenneti, Karim Helwani, Fangzhou Cheng, Mehmet Umut Isik, Arvindh Krishnaswamy
Abstract: A group masked autoencoder may be implemented for anomaly detection. An autoencoder network model may be trained without supervision and applied to output an estimated joint probability distribution of normality for a group of frames of time series data. The estimated joint probability distribution may be used to determine an anomaly score for the time series data. An anomaly may be detected according to the anomaly score and a result that indicates a detected anomaly may be provided.
-
Publication No.: US12167223B2
Publication Date: 2024-12-10
Application No.: US17810303
Application Date: 2022-06-30
Applicant: Amazon Technologies, Inc.
Inventor: Masahito Togami, Karim Helwani, Jean-Marc Valin, Michael Mark Goodwin
IPC: H04S7/00, G10L21/0216, H04S1/00
Abstract: Real-time low-complexity stereo speech enhancement with spatial cue preservation may be performed. A stereo speech enhancement system receives a stereo input signal (e.g., a left and right input signal). The stereo speech enhancement system estimates spatial cues for a target speaker and downmixes the stereo input signal into a monaural signal. A low-complexity model may then process the monaural signal to generate an enhanced monaural signal. The stereo speech enhancement system upmixes the enhanced monaural signal based on the estimated spatial cues for the target speaker, to generate an enhanced stereo output signal.
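The estimate, downmix, enhance, upmix pipeline described in this abstract can be sketched as follows. The abstract does not say which spatial cues are used or how they are estimated, so this sketch assumes a single inter-channel level cue computed from the whole input, and it passes the monaural enhancer in as a hypothetical callable `enhance_mono`. All function names are illustrative.

```python
import math


def estimate_gains(left, right, eps=1e-12):
    """Assumed spatial cue: per-channel gains derived from channel energy."""
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    total = energy_left + energy_right + eps
    return math.sqrt(energy_left / total), math.sqrt(energy_right / total)


def downmix(left, right):
    """Average the two channels into a monaural signal."""
    return [0.5 * (l + r) for l, r in zip(left, right)]


def upmix(mono, gains):
    """Re-apply the stored cue to rebuild a left and right channel."""
    gain_left, gain_right = gains
    return [gain_left * m for m in mono], [gain_right * m for m in mono]


def enhance_stereo(left, right, enhance_mono):
    """Estimate cues, downmix, enhance in mono, then upmix with the cues."""
    gains = estimate_gains(left, right)
    mono = enhance_mono(downmix(left, right))
    return upmix(mono, gains)


left = [1.0, 0.0, -1.0, 0.0]
right = [0.5 * x for x in left]
# Identity stands in for the low-complexity monaural model.
out_left, out_right = enhance_stereo(left, right, lambda mono: mono)
```

Because only the monaural signal passes through the model, the model runs once per frame rather than once per channel, while the stored cue keeps the left-to-right level relationship in the output.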
-
Publication No.: US11924367B1
Publication Date: 2024-03-05
Application No.: US17668297
Application Date: 2022-02-09
Applicant: Amazon Technologies, Inc.
Inventor: Jean-Marc Valin, Karim Helwani, Srikanth Venkata Tenneti, Erfan Soltanmohammadi, Mehmet Umut Isik, Richard Newman, Michael Mark Goodwin, Arvindh Krishnaswamy
IPC: H04M3/00, G10L21/0232, G10L21/034, G10L25/18, H04S3/00, G10L21/0208
CPC classification number: H04M3/002, G10L21/0232, G10L21/034, G10L25/18, H04S3/008, G10L2021/02082, H04S2400/01, H04S2400/03
Abstract: Joint noise and echo suppression may be performed for enhancing two-way audio communications. Audio data is captured at a communication device and audio data transmitted to the communication device from another communication device are used as input features to a trained machine learning model that uses the transmitted audio data as a reference signal to eliminate residual echo in the captured audio data when also suppressing noise in the captured audio data.
-
Publication No.: US11875810B1
Publication Date: 2024-01-16
Application No.: US17489538
Application Date: 2021-09-29
Applicant: Amazon Technologies, Inc.
Inventor: Karim Helwani, Emmanouil Theodosis
IPC: G10L21/02, G10L21/0208, H04M9/08, G06N3/045, G10L21/0216
CPC classification number: G10L21/0208, G06N3/045, H04M9/082, G10L2021/02082, G10L2021/02166
Abstract: At a first layer of an echo canceler, a first compensation for a first set of properties of output of an audio capture device of a first communication environment is applied. The first set of properties includes a property resulting from a difference in clock speeds of an audio capture device and an audio rendering device of the first communication environment. At a second layer of the echo canceler, at which output of the first layer is received, a second compensation for a second set of properties of the output of the first layer is applied. The second set of properties includes an echo. Applying the compensations comprises modifying neural network weights. Output of the second layer is transmitted to a second communication environment.
-
Publication No.: US20240007817A1
Publication Date: 2024-01-04
Application No.: US17810303
Application Date: 2022-06-30
Applicant: Amazon Technologies, Inc.
Inventor: Masahito Togami, Karim Helwani, Jean-Marc Valin, Michael Mark Goodwin
IPC: H04S7/00, H04S1/00, G10L21/0216
CPC classification number: H04S7/303, H04S1/007, G10L21/0216, H04S2400/03, H04S2400/11, H04S2400/15
Abstract: Real-time low-complexity stereo speech enhancement with spatial cue preservation may be performed. A stereo speech enhancement system receives a stereo input signal (e.g., a left and right input signal). The stereo speech enhancement system estimates spatial cues for a target speaker and downmixes the stereo input signal into a monaural signal. A low-complexity model may then process the monaural signal to generate an enhanced monaural signal. The stereo speech enhancement system upmixes the enhanced monaural signal based on the estimated spatial cues for the target speaker, to generate an enhanced stereo output signal.
-
Publication No.: US11468354B1
Publication Date: 2022-10-11
Application No.: US16708747
Application Date: 2019-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Karim Helwani, Alexander Caughron, Amin Hani Atrash, Aarthi Raveendran, Kevin V. Macwan
Abstract: A system may perform adaptive target presence probability to predict a location of a user (e.g., target) at a given time based on accumulated observations. For example, the system may track a location of the user over time and generate observation data associated with a user profile. The observation data may include a plurality of observations, with a single observation corresponding to a location and time at which the user was detected. The system may apply a clustering algorithm to the observation data to generate probability distributions (e.g., clusters) associated with discrete locations. For example, the system may group observations that are in proximity to each other and separate the groups based on location. Using the probability distribution for an individual cluster, the system may determine a likelihood that the user is present at a location corresponding to the cluster.
-
Publication No.: US12175434B2
Publication Date: 2024-12-24
Application No.: US17039649
Application Date: 2020-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Srikanth Venkata Tenneti, Arvindh Krishnaswamy, Karim Helwani, Mehmet Umut Isik, Ritwik Giri, Fangzhou Cheng, Aparna Pandey
IPC: G06Q10/20, G06F16/21, G06F16/906
Abstract: Systems, methods, and apparatuses for detecting anomalies using clusters are described. In some examples, a method includes receiving a request to perform anomaly detection using a plurality of clusters; receiving a data point; determining when the received data point is a part of one of the plurality of clusters utilizing a distance to centers of the one or more clusters, wherein: when the received data point is determined to belong to a normal cluster, assigning the received data point to the determined cluster, updating the cluster, and updating a history for the cluster, when the received data point is determined to belong to an anomalous cluster, raising an anomaly, updating the cluster, and updating a history for the cluster, and when the received data point is determined to not belong to any cluster, raising an anomaly.
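The three branches in this abstract (normal cluster, anomalous cluster, no cluster) can be sketched as below. The abstract only says that membership uses a distance to cluster centers, so the Euclidean distance, the fixed `radius` threshold, the running-mean center update, and the names `Cluster` and `classify` are assumptions for illustration.

```python
import math


class Cluster:
    def __init__(self, center, anomalous=False):
        self.center = list(center)
        self.anomalous = anomalous
        self.count = 1      # points absorbed so far, including the seed
        self.history = []   # points assigned after creation

    def update(self, point):
        """Move the center to the running mean and record the point."""
        self.count += 1
        self.center = [c + (p - c) / self.count
                       for c, p in zip(self.center, point)]
        self.history.append(tuple(point))


def classify(point, clusters, radius):
    """Return a label for the point and update the matched cluster, if any."""
    nearest = min(clusters, key=lambda c: math.dist(c.center, point),
                  default=None)
    if nearest is None or math.dist(nearest.center, point) > radius:
        return "anomaly: no cluster"
    nearest.update(point)
    if nearest.anomalous:
        return "anomaly: anomalous cluster"
    return "normal"


clusters = [Cluster([0.0, 0.0]), Cluster([10.0, 10.0], anomalous=True)]
first = classify([0.5, 0.0], clusters, radius=2.0)
second = classify([10.0, 11.0], clusters, radius=2.0)
third = classify([50.0, 50.0], clusters, radius=2.0)
```

In this sketch a point that matches no cluster raises an anomaly and leaves every cluster unchanged, while a point that matches an anomalous cluster raises an anomaly and still updates that cluster and its history, mirroring the branches listed in the abstract.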
-
Publication No.: US12008457B1
Publication Date: 2024-06-11
Application No.: US17037515
Application Date: 2020-09-29
Applicant: Amazon Technologies, Inc.
Inventor: Mehmet Umut Isik, Ritwik Giri, Neerad Dilip Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy
Abstract: Audio processing may be performed with a convolutional neural network that includes positional embeddings. Audio data may be received at an audio processing system. A convolutional neural network that concatenates frequency-positional embeddings at an input layer may be used to process the audio data. A result of processing the audio data through the convolutional neural network may be used to perform an audio processing task.
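Concatenating frequency-positional embeddings at the input layer can be pictured as adding extra input channels whose values depend only on the frequency bin. The abstract does not specify the embedding itself, so the fixed cosine embedding, the channel layout `[channel][time][frequency]`, and the function names below are assumptions for illustration; a learned embedding would slot in the same way.

```python
import math


def frequency_embeddings(num_bins, dim):
    """Return `dim` embedding rows, each with one value per frequency bin.

    Assumed form: cosines of increasing rate along the frequency axis.
    """
    scale = max(num_bins - 1, 1)
    return [[math.cos(math.pi * (k + 1) * f / scale) for f in range(num_bins)]
            for k in range(dim)]


def add_frequency_channels(spectrogram, dim):
    """Stack the spectrogram with `dim` embedding channels.

    `spectrogram` is indexed [time][frequency]; the result is indexed
    [channel][time][frequency], with channel 0 holding the original data.
    """
    num_bins = len(spectrogram[0])
    channels = [[list(frame) for frame in spectrogram]]
    for row in frequency_embeddings(num_bins, dim):
        # The same embedding row is repeated for every time frame.
        channels.append([list(row) for _ in spectrogram])
    return channels


spec = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
net_input = add_frequency_channels(spec, dim=2)
```

A convolution applied to `net_input` then sees, at every position, which frequency bin it is looking at, which a plain translation-invariant kernel over the spectrogram alone cannot tell.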
-
Publication No.: US20220101270A1
Publication Date: 2022-03-31
Application No.: US17039649
Application Date: 2020-09-30
Applicant: Amazon Technologies, Inc.
Inventor: Srikanth Venkata Tenneti, Arvindh Krishnaswamy, Karim Helwani, Mehmet Umut Isik, Ritwik Giri, Fangzhou Cheng, Aparna Pandey
IPC: G06Q10/00, G06F16/906, G06F16/21
Abstract: Systems, methods, and apparatuses for detecting anomalies using clusters are described. In some examples, a method includes receiving a request to perform anomaly detection using a plurality of clusters; receiving a data point; determining when the received data point is a part of one of the plurality of clusters utilizing a distance to centers of the one or more clusters, wherein: when the received data point is determined to belong to a normal cluster, assigning the received data point to the determined cluster, updating the cluster, and updating a history for the cluster, when the received data point is determined to belong to an anomalous cluster, raising an anomaly, updating the cluster, and updating a history for the cluster, and when the received data point is determined to not belong to any cluster, raising an anomaly.