-
Publication No.: US11545174B2
Publication Date: 2023-01-03
Application No.: US17178844
Application Date: 2021-02-18
Applicant: Amazon Technologies, Inc.
Inventor: Daniel Kenneth Bone, Chao Wang, Viktor Rozgic
Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.
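
The baseline selection and comparison described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the feature vectors, the Euclidean distance, the threshold, and the names `BaselineStore`, `deviation`, and `detect_emotion` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import sqrt


@dataclass
class BaselineStore:
    """Holds one neutral-state feature vector per context (hypothetical structure)."""

    baselines: dict[str, list[float]] = field(default_factory=dict)

    def add(self, context: str, features: list[float]) -> None:
        self.baselines[context] = features

    def select(self, context: str, fallback: str = "default") -> list[float]:
        # Pick the baseline for the current context, else a fallback baseline.
        if context in self.baselines:
            return self.baselines[context]
        return self.baselines[fallback]


def deviation(baseline: list[float], features: list[float]) -> float:
    """Euclidean distance between input features and the baseline (assumed metric)."""
    return sqrt(sum((f - b) ** 2 for f, b in zip(features, baseline)))


def detect_emotion(store: BaselineStore, context: str,
                   features: list[float], threshold: float = 1.0) -> str:
    """Label the input as neutral or non-neutral relative to the selected baseline."""
    baseline = store.select(context)
    return "neutral" if deviation(baseline, features) < threshold else "non-neutral"


store = BaselineStore()
store.add("home", [1.0, 1.0])   # made-up neutral features for a quiet environment
store.add("car", [3.0, 3.0])    # made-up neutral features for a noisy environment
label_home = detect_emotion(store, "home", [1.1, 0.9])
label_car = detect_emotion(store, "car", [1.1, 0.9])
```

The same input is judged differently depending on which baseline the context selects, which is the point of storing several baselines per user. A real system would likely use a trained model rather than a fixed distance threshold.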
-
Publication No.: US20210249035A1
Publication Date: 2021-08-12
Application No.: US17178844
Application Date: 2021-02-18
Applicant: Amazon Technologies, Inc.
Inventor: Daniel Kenneth Bone, Chao Wang, Viktor Rozgic
Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.
-
Publication No.: US11532300B1
Publication Date: 2022-12-20
Application No.: US16913996
Application Date: 2020-06-26
Applicant: AMAZON TECHNOLOGIES, INC.
Inventor: Daniel Kenneth Bone, Viktor Rozgic, Chao Wang
Abstract: A device with a microphone acquires audio data of a user's speech. A neural network accepts audio data as input and provides sentiment data as output. The neural network is trained using training data based on input from raters who provide votes as to which sentiment descriptors they think are associated with a sample of speech. A vote by a rater assessing the sample for a particular semantic descriptor is distributed to a plurality of semantically similar semantic descriptors. Semantic descriptor similarity data indicates relative similarity between possible semantic descriptors in the semantic space. The distributed partial votes may be aggregated to produce training data comprising samples of speech and weights of corresponding semantic descriptors. The training data is then used to train the neural network. For example, the neural network may be trained with the training data using per-instance cosine similarity loss or correlational loss.
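
The vote distribution and cosine similarity loss mentioned in this abstract can be sketched as follows. This is one plausible reading under stated assumptions, not the patented method: the similarity table, the normalization of each vote so it sums to one, and the function names `distribute_votes` and `cosine_similarity_loss` are hypothetical.

```python
from math import sqrt


def distribute_votes(votes, similarity):
    """Spread each rater vote across semantically similar descriptors.

    votes: list of descriptor names, one entry per rater vote.
    similarity: dict mapping a descriptor to {descriptor: similarity score}.
    Each vote is split in proportion to similarity (assumed scheme), so one
    vote always contributes a total weight of 1.0.
    """
    weights = {}
    for voted in votes:
        neighbors = similarity[voted]
        total = sum(neighbors.values())
        for descriptor, score in neighbors.items():
            weights[descriptor] = weights.get(descriptor, 0.0) + score / total
    return weights


def cosine_similarity_loss(predicted, target):
    """Per-instance loss: 1 minus the cosine similarity of the two vectors."""
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = sqrt(sum(p * p for p in predicted))
    norm_t = sqrt(sum(t * t for t in target))
    if norm_p == 0.0 or norm_t == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar (assumption)
    return 1.0 - dot / (norm_p * norm_t)


# Made-up similarity data for three descriptors.
similarity = {
    "happy": {"happy": 1.0, "cheerful": 0.5, "sad": 0.0},
    "sad": {"sad": 1.0, "happy": 0.0, "cheerful": 0.0},
}
target_weights = distribute_votes(["happy", "happy", "sad"], similarity)
order = ["happy", "cheerful", "sad"]
target = [target_weights[d] for d in order]
loss = cosine_similarity_loss([1.0, 0.5, 0.8], target)
```

The aggregated weights serve as the training target for a speech sample, and the loss compares the network's output vector to that target by direction rather than magnitude.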
-
Publication No.: US10943604B1
Publication Date: 2021-03-09
Application No.: US16456158
Application Date: 2019-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Daniel Kenneth Bone, Chao Wang, Viktor Rozgic
Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.
-