-
Publication number: US11783850B1
Publication date: 2023-10-10
Application number: US17216840
Filing date: 2021-03-30
Applicant: Amazon Technologies, Inc.
Inventor: Harshavardhan Sundar , Sheetal Laad , Jialiang Bao , Ming Sun , Chao Wang , Chungnam Chan , Cengiz Erbas , Mathias Jourdain , Nipul Bharani , Aaron David Wirshba
CPC classification number: G10L25/51 , G10L15/063 , G10L15/22 , G10L25/78 , G10L2015/0635
Abstract: Techniques for detecting certain acoustic events from audio data are described. A system may perform event aggregation for certain types of events before sending a device an output representing that the event is detected. The system may bypass the event aggregation process for certain types of events that the system may detect with a high level of confidence. In such cases, the system may send an output to the device when the event is detected. The system may be used to detect acoustic events representing the presence of a person or other harmful circumstances (such as fire, smoke, etc.) in a home, an office, a store, or other types of indoor settings.
-
Publication number: US20230027828A1
Publication date: 2023-01-26
Application number: US17740910
Filing date: 2022-05-10
Applicant: Amazon Technologies, Inc.
Inventor: Gustavo Alfonso Aguilar Alas , Viktor Rozgic , Chao Wang
IPC: G10L15/26 , G06F40/284 , G10L15/06 , G10L15/16
Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.
-
Publication number: US11335347B2
Publication date: 2022-05-17
Application number: US16429689
Filing date: 2019-06-03
Applicant: Amazon Technologies, Inc.
Inventor: Gustavo Alfonso Aguilar Alas , Viktor Rozgic , Chao Wang
IPC: G10L15/26 , G06F40/284 , G10L15/06 , G10L15/16
Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.
-
Publication number: US11043218B1
Publication date: 2021-06-22
Application number: US16452964
Filing date: 2019-06-26
Applicant: Amazon Technologies, Inc.
Inventor: Ming Sun , Thibaud Senechal , Yixin Gao , Anish N. Shah , Spyridon Matsoukas , Chao Wang , Shiv Naga Prasad Vitaladevuni
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
-
Publication number: US11003491B1
Publication date: 2021-05-11
Application number: US16138413
Filing date: 2018-09-21
Applicant: Amazon Technologies, Inc.
Inventor: Timothy David Gasser , Chao Wang
IPC: G06F15/173 , G06F9/48 , G06F9/50 , G06F7/58 , H04L12/26
Abstract: Techniques for optimizing background tasks based on forecast data are described. Customer workloads may be monitored by a local monitor in a first time period. Future customer workloads in a second time period following the first time period may be forecast based at least on the customer workloads using a local model. A background availability may be determined based at least on the future customer workloads. Execution of at least one background workload may be scheduled to use the background availability during the second time period. The local monitor may then cause the execution of the at least one background workload.
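The forecast-then-schedule flow in this abstract can be illustrated with a minimal sketch. The abstract does not say how the local model forecasts, how availability is computed, or how workloads are chosen, so the moving-average forecast, the capacity-minus-forecast availability, the greedy selection, and all function and task names below are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def forecast_workload(history: Sequence[float], window: int = 3) -> float:
    """Forecast next-period customer load as the mean of the last `window` samples.

    A moving average is an assumed stand-in for the patent's "local model".
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    recent = history[-window:]
    return sum(recent) / len(recent)


def background_availability(capacity: float, forecast: float) -> float:
    """Capacity left for background work once forecast customer load is served."""
    return max(0.0, capacity - forecast)


def schedule_background(tasks: Sequence[Tuple[str, float]], availability: float) -> List[str]:
    """Greedily pick background tasks (name, cost), in order, that fit the availability."""
    scheduled = []
    remaining = availability
    for name, cost in tasks:
        if cost <= remaining:
            scheduled.append(name)
            remaining -= cost
    return scheduled


# Hypothetical usage: load samples observed in the first time period.
observed = [40, 50, 60, 70]
forecast = forecast_workload(observed)              # mean of 50, 60, 70
available = background_availability(100, forecast)  # capacity of 100 units
plan = schedule_background([("compaction", 25), ("scrub", 20), ("gc", 10)], available)
```

Here the greedy pass skips any task that no longer fits and continues with the smaller ones that do; a real scheduler would likely also account for priorities and deadlines.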
-
Publication number: US11790919B2
Publication date: 2023-10-17
Application number: US17740910
Filing date: 2022-05-10
Applicant: Amazon Technologies, Inc.
Inventor: Gustavo Alfonso Aguilar Alas , Viktor Rozgic , Chao Wang
IPC: G10L15/26 , G06F40/284 , G10L15/06 , G10L15/16
CPC classification number: G10L15/26 , G06F40/284 , G10L15/063 , G10L15/16
Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.
-
Publication number: US11545174B2
Publication date: 2023-01-03
Application number: US17178844
Filing date: 2021-02-18
Applicant: Amazon Technologies, Inc.
Inventor: Daniel Kenneth Bone , Chao Wang , Viktor Rozgic
Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.
-
Publication number: US20210249035A1
Publication date: 2021-08-12
Application number: US17178844
Filing date: 2021-02-18
Applicant: Amazon Technologies, Inc.
Inventor: Daniel Kenneth Bone , Chao Wang , Viktor Rozgic
Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.
-
Publication number: US20210027798A1
Publication date: 2021-01-28
Application number: US17022197
Filing date: 2020-09-16
Applicant: Amazon Technologies, Inc.
Inventor: Shiva Kumar Sundaram , Chao Wang , Shiv Naga Prasad Vitaladevuni , Spyridon Matsoukas , Arindam Mandal
Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present near the device, even if no wakeword is spoken. Audio such as speech, human-originating sounds (e.g., coughing, sneezing), or other human-related noises (e.g., footsteps, doors closing) can be used to detect presence. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
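The per-frame scoring and smoothing step can be sketched as follows. The abstract does not specify the smoothing method or the decision rule, so the centered moving average, the 0.5 threshold, and the function names are assumptions for illustration rather than the patented method.

```python
from typing import List, Sequence


def smooth_scores(scores: Sequence[float], radius: int = 2) -> List[float]:
    """Average each frame score with its neighbours within `radius` frames.

    Windows are truncated at the start and end of the sequence.
    """
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def presence_decisions(scores: Sequence[float], radius: int = 2,
                       threshold: float = 0.5) -> List[bool]:
    """Per-frame presence decision: smoothed score at or above the threshold."""
    return [s >= threshold for s in smooth_scores(scores, radius)]


# An isolated one-frame spike is suppressed by its quiet neighbours.
spike = presence_decisions([0.0, 0.0, 1.0, 0.0, 0.0], radius=1)

# A one-frame dropout inside sustained activity is filled in.
dropout = presence_decisions([0.9, 0.8, 0.1, 0.9, 0.9], radius=1)
```

Smoothing in this way trades a little latency for stability: a single noisy frame neither triggers nor cancels a presence decision on its own.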
-
Publication number: US10803885B1
Publication date: 2020-10-13
Application number: US16023923
Filing date: 2018-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Chieh-Chi Kao , Chao Wang , Weiran Wang , Ming Sun
Abstract: An audio event detection system processes audio data into audio feature data, then processes the audio feature data using pre-configured candidate interval lengths to identify top candidate regions of the feature data that may include an audio event. The feature data from the top candidate regions are then scored by a classifier, where the score indicates a likelihood that the candidate region corresponds to a desired audio event. The scores are compared to a threshold, and if the threshold is satisfied, the top scoring candidate region is determined to include an audio event.
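The propose, score, and threshold pipeline in this abstract can be sketched as below. The abstract does not say how candidate regions are ranked or what the features look like, so the one-value-per-frame features, the mean-value ranking, the caller-supplied `classifier`, and all names here are hypothetical choices for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Region = Tuple[int, int]  # (start_frame, end_frame_exclusive)


def top_candidate_regions(features: Sequence[float],
                          interval_lengths: Sequence[int],
                          top_k: int = 3) -> List[Region]:
    """Slide a window of each pre-configured length and keep the top_k regions.

    Regions are ranked by mean feature value, an assumed proposal score.
    """
    candidates = []
    for length in interval_lengths:
        for start in range(0, len(features) - length + 1):
            window = features[start:start + length]
            candidates.append((sum(window) / length, (start, start + length)))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [region for _, region in candidates[:top_k]]


def detect_event(features: Sequence[float],
                 interval_lengths: Sequence[int],
                 classifier: Callable[[Sequence[float]], float],
                 threshold: float = 0.5,
                 top_k: int = 3) -> Optional[Region]:
    """Score the top candidates; return the best region if it meets the threshold."""
    regions = top_candidate_regions(features, interval_lengths, top_k)
    if not regions:
        return None
    scored = [(classifier(features[s:e]), (s, e)) for s, e in regions]
    best_score, best_region = max(scored, key=lambda item: item[0])
    return best_region if best_score >= threshold else None


# Hypothetical usage with a toy classifier (the minimum value in the region).
frames = [0.1, 0.1, 0.9, 0.8, 0.1, 0.1]
hit = detect_event(frames, interval_lengths=[2, 3], classifier=min)
```

Restricting the classifier to a few top candidates keeps the expensive scoring step cheap, at the cost of missing events that the proposal ranking scores poorly.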