Patent search: ap:("Amazon Technologies Inc.") AND inv:"Chao Wang"

1.

Granted invention patent
Self-supervised federated learning (In force)

Publication number: US12039998B1

Publication date: 2024-07-16

Application number: US17665129

Application date: 2022-02-04

Applicant: Amazon Technologies, Inc.

Inventor: Chieh-Chi Kao , Qingming Tang , Ming Sun , Viktor Rozgic , Spyridon Matsoukas , Chao Wang

IPC: G10L25/78 , G06N3/045 , G06N3/08 , G10L25/21

CPC classification number: G10L25/78 , G06N3/045 , G06N3/08 , G10L25/21

Abstract: An acoustic event detection system may employ self-supervised federated learning to update encoder and/or classifier machine learning models. In an example operation, an encoder may be pre-trained to extract audio feature data from an audio signal. A decoder may be pre-trained to predict a subsequent portion of audio data (e.g., a subsequent frame of audio data represented by log filterbank energies). The encoder and decoder may be trained using self-supervised learning to improve the decoder's predictions and, by extension, the quality of the audio feature data generated by the encoder. The system may apply federated learning to share encoder updates across user devices. The system may fine-tune the classifier to improve inferences based on the improved audio feature data. The system may distribute classifier updates to the user device(s) to update the on-device classifier.
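The federated step in the abstract above (sharing encoder updates across user devices) can be illustrated with a small sketch. The abstract does not say how updates are combined, so the example-count-weighted average below (in the style of federated averaging) is an assumption, and the name `federated_average` and the dictionary layout of the weights are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine per-device encoder weights into one set of weights.

    Each entry is (weights, number_of_local_examples). Weighting by the
    example count is an assumption, not something the abstract states.
    """
    if not updates:
        raise ValueError("at least one device update is required")
    total = sum(count for _, count in updates)
    if total <= 0:
        raise ValueError("example counts must sum to a positive number")

    reference = updates[0][0]
    averaged: Weights = {}
    for name, values in reference.items():
        averaged[name] = [
            sum(weights[name][i] * count for weights, count in updates) / total
            for i in range(len(values))
        ]
    return averaged
```

In this sketch only the encoder weights leave the device; the raw audio used for the self-supervised training stays local.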

2.

Granted invention patent
Streaming self-attention in a neural network (In force)

Publication number: US11961514B1

Publication date: 2024-04-16

Application number: US17547610

Application date: 2021-12-10

Applicant: Amazon Technologies, Inc.

Inventor: Chia-Jung Chang , Qingming Tang , Ming Sun , Chao Wang

IPC: G10L15/16 , G10L15/14 , G10L17/16

CPC classification number: G10L15/16

Abstract: An acoustic event detection system may employ one or more recurrent neural networks (RNNs) to extract features from audio data, and use the extracted features to determine the presence of an acoustic event. The system may use self-attention to emphasize features extracted from portions of audio data that may include features more useful for detecting acoustic events. The system may perform self-attention in an iterative manner to reduce the amount of memory used to store hidden states of the RNN while processing successive portions of the audio data. The system may process the portions of the audio data using the RNN to generate a hidden state for each portion. The system may calculate an interim embedding for each hidden state. An interim embedding calculated for the last hidden state may be normalized to determine a final embedding representing features extracted from the input data by the RNN.
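The iterative self-attention described in the abstract above can be sketched as a running weighted sum that is normalized once, after the last hidden state. This is a minimal illustration, not the patented method: the dot-product scoring against a `query` vector, the running-maximum rescaling for numerical stability, and the name `streaming_attention_pool` are all assumptions. The RNN itself is omitted; its hidden states are passed in as an iterable.

```python
import math
from typing import Iterable, List, Optional, Sequence


def streaming_attention_pool(
    hidden_states: Iterable[Sequence[float]],
    query: Sequence[float],
) -> List[float]:
    """Attention-pool hidden states one at a time, storing none of them.

    Keeps only an interim embedding (running weighted sum), a running
    normalizer, and the largest score seen so far.
    """
    running_max = -math.inf
    normalizer = 0.0
    interim: Optional[List[float]] = None

    for state in hidden_states:
        # Assumed scoring function: dot product with a learned query vector.
        score = sum(q * x for q, x in zip(query, state))
        new_max = max(running_max, score)
        # Rescale what was accumulated so far to the new reference maximum.
        scale = math.exp(running_max - new_max) if normalizer > 0.0 else 0.0
        weight = math.exp(score - new_max)
        if interim is None:
            interim = [0.0] * len(state)
        interim = [scale * e + weight * x for e, x in zip(interim, state)]
        normalizer = scale * normalizer + weight
        running_max = new_max

    if interim is None:
        raise ValueError("at least one hidden state is required")
    # Normalizing the last interim embedding gives the final embedding.
    return [e / normalizer for e in interim]
```

Memory use in this sketch is independent of the number of audio portions, which is the property the abstract attributes to the iterative formulation.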

3.

Published invention application
USER PRESENCE DETECTION (Under examination, published)

Publication number: US20230410833A1

Publication date: 2023-12-21

Application number: US18131531

Application date: 2023-04-06

Applicant: Amazon Technologies, Inc.

Inventor: Shiva Kumar Sundaram , Chao Wang , Shiv Naga Prasad Vitaladevuni , Spyridon Matsoukas , Arindam Mandal

IPC: G10L25/30 , G10L25/51 , G10L15/02 , G10L15/16 , G10L15/22 , G10L15/30 , G10L25/78

CPC classification number: G10L25/30 , G10L25/51 , G10L15/02 , G10L15/16 , G10L15/22 , G10L15/30 , G10L25/78 , G10L2015/088

Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
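The per-frame scoring and smoothing in the abstract above can be sketched as follows. The abstract does not name a smoothing method, so the centered moving average, the 0.5 threshold, and the function names are assumptions made for illustration; the per-frame scores are taken as given.

```python
from typing import List


def smooth_scores(scores: List[float], radius: int = 2) -> List[float]:
    """Average each frame score with up to `radius` neighbors on each side."""
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def presence_decisions(
    scores: List[float], radius: int = 2, threshold: float = 0.5
) -> List[bool]:
    """Turn smoothed frame scores into per-frame presence decisions."""
    return [s >= threshold for s in smooth_scores(scores, radius)]


def heartbeat(decisions: List[bool]) -> bool:
    """Presence flag for one reporting period (assumed rule: any positive frame)."""
    return any(decisions)
```

Smoothing of this kind suppresses an isolated high-scoring frame while keeping a sustained run of detections, so a single noisy frame does not flip the reported presence state.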

4.

Published invention application
ACOUSTIC EVENT DETECTION (Under examination, published)

Publication number: US20230186939A1

Publication date: 2023-06-15

Application number: US17547644

Application date: 2021-12-10

Applicant: Amazon Technologies, Inc.

Inventor: Qingming Tang , Chieh-Chi Kao , Qin Zhang , Ming Sun , Chao Wang , Sumit Garg , Rong Chen , James Garnet Droppo , Chia-Jung Chang

IPC: G10L25/51 , G10L25/21 , G10L25/30 , G06N3/08 , G06N3/04

CPC classification number: G10L25/51 , G10L25/21 , G10L25/30 , G06N3/08 , G06N3/0454 , G10L15/22

Abstract: A system may include a first acoustic event detection (AED) component configured to detect a predetermined set of acoustic events, and include a second AED component configured to detect custom acoustic events that a user configures a device to detect. The first and second AED components are configured to perform task-specific processing, and may receive as input the same acoustic feature data corresponding to audio data that potentially represents occurrence of one or more events. Based on processing by the first and second AED components, a device may output data indicating that one or more acoustic events occurred, where the acoustic events may be a predetermined acoustic event and/or a custom acoustic event.

5.

Granted invention patent
Wakeword and acoustic event detection (In force)

Publication number: US11670299B2

Publication date: 2023-06-06

Application number: US17321999

Application date: 2021-05-17

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun , Thibaud Senechal , Yixin Gao , Anish N. Shah , Spyridon Matsoukas , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22 , G10L15/16

CPC classification number: G10L15/22 , G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
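The two post-processing paths in the abstract above (softmax, smoothing, and spike detection for the wakeword; sigmoid for acoustic events) can be sketched as below. The models that produce the logits are omitted. The trailing-window average, the threshold crossing used as a stand-in for spike detection, and all names and default values are assumptions, not details taken from the patent.

```python
import math
from typing import List, Optional


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one frame's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def wakeword_spike(
    frame_logits: List[List[float]],
    wake_index: int = 1,
    window: int = 3,
    threshold: float = 0.8,
) -> Optional[int]:
    """Index of the first frame whose smoothed wakeword posterior crosses
    the threshold, or None if it never does (assumed spike criterion)."""
    posteriors = [softmax(logits)[wake_index] for logits in frame_logits]
    for i in range(len(posteriors)):
        recent = posteriors[max(0, i - window + 1): i + 1]
        if sum(recent) / len(recent) >= threshold:
            return i
    return None


def acoustic_events(event_logits: List[float], threshold: float = 0.5) -> List[bool]:
    """Independent per-event decisions from sigmoid-activated logits."""
    return [sigmoid(x) >= threshold for x in event_logits]
```

Softmax fits the wakeword path because the classes compete within a frame, while independent sigmoids let several acoustic events be flagged at once.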

6.

Granted invention patent
Wakeword detection using multi-word model (In force)

Publication number: US11308939B1

Publication date: 2022-04-19

Application number: US16140737

Application date: 2018-09-25

Applicant: Amazon Technologies, Inc.

Inventor: Yixin Gao , Ming Sun , Varun Nagaraja , Gengshen Fu , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/14 , G10L15/22 , G06F3/16 , G10L15/08

Abstract: A system and method performs wakeword detection and automatic speech recognition using the same acoustic model. A mapping engine maps phones/senones output by the acoustic model to phones/senones corresponding to the wakeword. A hidden Markov model (HMM) may determine that the wakeword is present in audio data; the HMM may have multiple paths for multiple wakewords or may have multiple models. Once the wakeword is detected, ASR is performed using the acoustic model.

7.

Invention application
WAKEWORD AND ACOUSTIC EVENT DETECTION (In force)

Publication number: US20210358497A1

Publication date: 2021-11-18

Application number: US17321999

Application date: 2021-05-17

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun , Thibaud Senechal , Yixin Gao , Anish N. Shah , Spyridon Matsoukas , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22 , G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

8.

Granted invention patent
Media presence detection (In force)

Publication number: US11069352B1

Publication date: 2021-07-20

Application number: US16278440

Application date: 2019-02-18

Applicant: Amazon Technologies, Inc.

Inventor: Qingming Tang , Ming Sun , Chieh-Chi Kao , Chao Wang , Viktor Rozgic

IPC: G10L15/22 , G10L25/78 , G10L15/16 , G10L15/02

Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.

9.

Granted invention patent
Acoustic event detection (In force)

Publication number: US12068001B2

Publication date: 2024-08-20

Application number: US18243800

Application date: 2023-09-08

Applicant: Amazon Technologies, Inc.

Inventor: Harshavardhan Sundar , Sheetal Laad , Jialiang Bao , Ming Sun , Chao Wang , Chungnam Chan , Cengiz Erbas , Mathias Jourdain , Nipul Bharani , Aaron David Wirshba

IPC: G10L25/51 , G10L15/06 , G10L15/22 , G10L25/78

CPC classification number: G10L25/51 , G10L15/063 , G10L15/22 , G10L25/78 , G10L2015/0635

Abstract: Techniques for detecting certain acoustic events from audio data are described. A system may perform event aggregation for certain types of events before sending an output to a device representing the event is detected. The system may bypass the event aggregation process for certain types of events that the system may detect with a high level of confidence. In such cases, the system may send an output to the device when the event is detected. The system may be used to detect acoustic events representing presence of a person or other harmful circumstances (such as, fire, smoke, etc.) in a home, an office, a store, or other types of indoor settings.
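The aggregation and bypass behavior in the abstract above can be sketched as a small routing function. The abstract does not say how aggregation works, so the rule used here (report an aggregated event type once it has been detected `min_count` times, then reset) is an assumption, as are the function name and the example event labels.

```python
from typing import Dict, Iterable, List, Set


def report_events(
    detections: Iterable[str],
    bypass_types: Set[str],
    min_count: int = 3,
) -> List[str]:
    """Return the event types that would be sent to the device, in order.

    Event types in `bypass_types` (those assumed to be detected with high
    confidence) are reported immediately. All other types are aggregated
    and reported once per `min_count` detections.
    """
    counts: Dict[str, int] = {}
    outputs: List[str] = []
    for event_type in detections:
        if event_type in bypass_types:
            outputs.append(event_type)  # skip aggregation entirely
            continue
        counts[event_type] = counts.get(event_type, 0) + 1
        if counts[event_type] == min_count:
            outputs.append(event_type)
            counts[event_type] = 0  # start a new aggregation round
    return outputs
```

For instance, with `bypass_types={"smoke_alarm"}`, a single smoke-alarm detection is reported at once, while footsteps are reported only after the third detection.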

10.

Granted invention patent
Character-level emotion detection (In force)

Publication number: US11869535B1

Publication date: 2024-01-09

Application number: US16711883

Application date: 2019-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Mohammad Taha Bahadori , Viktor Rozgic , Alexander Jonathan Pinkus , Chao Wang , David Heckerman

IPC: G10L25/63 , G10L15/22 , G10L15/16 , G10L15/06 , G06N3/04 , G06N3/08 , G10L15/18 , G06N3/044

CPC classification number: G10L25/63 , G06N3/044 , G06N3/08 , G10L15/063 , G10L15/16 , G10L15/1815 , G10L15/22 , G10L2015/223

Abstract: Described is a system and method that determines character sequences from speech, without determining the words of the speech, and processes the character sequences to determine sentiment data indicative of emotional state of a user that output the speech. The emotional state may then be presented or provided as an output to the user.
