Patent search: ap:("Amazon Technologies, Inc.") AND inv:"Viktor Rozgic"

1.

Granted invention patent
Sentiment detection in audio data (Active)

Publication (announcement) number: US11854538B1

Publication (announcement) date: 2023-12-26

Application number: US16277328

Application date: 2019-02-15

Applicant: Amazon Technologies, Inc.

Inventor: Viktor Rozgic, Chao Wang, Ming Sun, Srinivas Parthasarathy

IPC: G10L15/18, G10L15/06, G10L15/07, G10L15/16, G10L15/02

CPC classification number: G10L15/1815, G10L15/02, G10L15/063, G10L15/07, G10L15/16

Abstract: Described herein is a system for sentiment detection in audio data. The system processes audio frame level features of input audio data using a machine learning algorithm to classify the input audio data into a particular sentiment category. The machine learning algorithm may be a neural network trained using an encoder-decoder method. The training of the machine learning algorithm may include normalization techniques to avoid potential bias in the training data that may occur when the training data is annotated for a perceived sentiment of the speaker.

2.

Granted invention patent
Multiple classifications of audio data (Active)

Publication (announcement) number: US11790919B2

Publication (announcement) date: 2023-10-17

Application number: US17740910

Application date: 2022-05-10

Applicant: Amazon Technologies, Inc.

Inventor: Gustavo Alfonso Aguilar Alas, Viktor Rozgic, Chao Wang

IPC: G10L15/26, G06F40/284, G10L15/06, G10L15/16

CPC classification number: G10L15/26, G06F40/284, G10L15/063, G10L15/16

Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.

3.

Granted invention patent
Emotion detection using speaker baseline (Active)

Publication (announcement) number: US11545174B2

Publication (announcement) date: 2023-01-03

Application number: US17178844

Application date: 2021-02-18

Applicant: Amazon Technologies, Inc.

Inventor: Daniel Kenneth Bone, Chao Wang, Viktor Rozgic

IPC: G10L25/63, G10L17/04

Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.
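Illustrative sketch (not taken from the patent): the abstract above describes selecting a context-specific neutral baseline and comparing input speech against it. The Python below shows one simple way such a comparison could look. The class `SpeakerProfile`, the functions `deviation` and `detect_emotion`, the three-value feature vectors, the Euclidean distance, and the threshold are all hypothetical assumptions made for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Dict, List


@dataclass
class SpeakerProfile:
    """Hypothetical store of neutral-state baselines, keyed by context label."""
    baselines: Dict[str, List[float]] = field(default_factory=dict)

    def select_baseline(self, context: str, default: str = "default") -> List[float]:
        # Fall back to a default baseline when no context-specific one exists.
        return self.baselines.get(context, self.baselines[default])


def deviation(features: List[float], baseline: List[float]) -> float:
    """Euclidean distance between input features and the baseline (an assumption)."""
    if len(features) != len(baseline):
        raise ValueError("feature and baseline vectors must have equal length")
    return sqrt(sum((f - b) ** 2 for f, b in zip(features, baseline)))


def detect_emotion(profile: SpeakerProfile, features: List[float],
                   context: str, threshold: float = 1.0) -> str:
    """Label speech as 'neutral' or 'non-neutral' relative to the chosen baseline."""
    baseline = profile.select_baseline(context)
    return "non-neutral" if deviation(features, baseline) > threshold else "neutral"


# Made-up feature values (e.g. normalized pitch, energy, speaking rate).
profile = SpeakerProfile({"default": [0.0, 0.0, 0.0], "driving": [1.0, 1.0, 1.0]})
sample = [1.0, 1.2, 0.9]
label_driving = detect_emotion(profile, sample, "driving")
label_home = detect_emotion(profile, sample, "home")  # no "home" baseline, uses default
```

In this sketch the same input is judged differently depending on which baseline is selected, which is the point the abstract makes about per-context baselines.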

4.

Invention patent application
EMOTION DETECTION USING SPEAKER BASELINE (Active)

Publication (announcement) number: US20210249035A1

Publication (announcement) date: 2021-08-12

Application number: US17178844

Application date: 2021-02-18

Applicant: Amazon Technologies, Inc.

Inventor: Daniel Kenneth Bone, Chao Wang, Viktor Rozgic

IPC: G10L25/63, G10L17/04

Abstract: Described herein is a system for emotion detection in audio data using a speaker's baseline. The baseline may represent a user's speaking style in a neutral emotional state. The system is configured to compare the user's baseline with input audio representing speech from the user to determine an emotion of the user. The system may store multiple baselines for the user, each associated with a different context (e.g., environment, activity, etc.), and select one of the baselines to compare with the input audio based on the contextual situation.

5.

Granted invention patent
Character-level emotion detection (Active)

Publication (announcement) number: US11869535B1

Publication (announcement) date: 2024-01-09

Application number: US16711883

Application date: 2019-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Mohammad Taha Bahadori, Viktor Rozgic, Alexander Jonathan Pinkus, Chao Wang, David Heckerman

IPC: G10L25/63, G10L15/22, G10L15/16, G10L15/06, G06N3/04, G06N3/08, G10L15/18, G06N3/044

CPC classification number: G10L25/63, G06N3/044, G06N3/08, G10L15/063, G10L15/16, G10L15/1815, G10L15/22, G10L2015/223

Abstract: Described is a system and method that determines character sequences from speech, without determining the words of the speech, and processes the character sequences to determine sentiment data indicative of emotional state of a user that output the speech. The emotional state may then be presented or provided as an output to the user.

6.

Invention patent application
MULTIPLE CLASSIFICATIONS OF AUDIO DATA (Active)

Publication (announcement) number: US20230027828A1

Publication (announcement) date: 2023-01-26

Application number: US17740910

Application date: 2022-05-10

Applicant: Amazon Technologies, Inc.

Inventor: Gustavo Alfonso Aguilar Alas, Viktor Rozgic, Chao Wang

IPC: G10L15/26, G06F40/284, G10L15/06, G10L15/16

Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.

7.

Granted invention patent
Multiple classifications of audio data (Active)

Publication (announcement) number: US11335347B2

Publication (announcement) date: 2022-05-17

Application number: US16429689

Application date: 2019-06-03

Applicant: Amazon Technologies, Inc.

Inventor: Gustavo Alfonso Aguilar Alas, Viktor Rozgic, Chao Wang

IPC: G10L15/26, G06F40/284, G10L15/06, G10L15/16

Abstract: Described herein is a system for sentiment detection in audio data. The system is trained using acoustic information and lexical information to determine a sentiment corresponding to an utterance. In some cases when lexical information is not available, the system (trained on acoustic and lexical information) is configured to determine a sentiment using only acoustic information.

8.

Granted invention patent
Self-supervised federated learning (Active)

Publication (announcement) number: US12039998B1

Publication (announcement) date: 2024-07-16

Application number: US17665129

Application date: 2022-02-04

Applicant: Amazon Technologies, Inc.

Inventor: Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyridon Matsoukas, Chao Wang

IPC: G10L25/78, G06N3/045, G06N3/08, G10L25/21

CPC classification number: G10L25/78, G06N3/045, G06N3/08, G10L25/21

Abstract: An acoustic event detection system may employ self-supervised federated learning to update encoder and/or classifier machine learning models. In an example operation, an encoder may be pre-trained to extract audio feature data from an audio signal. A decoder may be pre-trained to predict a subsequent portion of audio data (e.g., a subsequent frame of audio data represented by log filterbank energies). The encoder and decoder may be trained using self-supervised learning to improve the decoder's predictions and, by extension, the quality of the audio feature data generated by the encoder. The system may apply federated learning to share encoder updates across user devices. The system may fine-tune the classifier to improve inferences based on the improved audio feature data. The system may distribute classifier updates to the user device(s) to update the on-device classifier.
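Illustrative sketch (not taken from the patent): the abstract above mentions sharing encoder updates across user devices through federated learning, but it does not name an aggregation rule. The Python below assumes a sample-weighted average of per-device weights (commonly called federated averaging) purely as an example. The function name `federated_average` and the flat-list representation of encoder weights are hypothetical.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Combine per-device encoder weights into one set of shared weights.

    Each update is (weights, num_local_samples). Devices that trained on
    more local audio contribute proportionally more to the average.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("at least one update with a positive sample count is required")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all weight vectors must have the same length")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Two hypothetical devices: one with 1 local sample, one with 3.
shared_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

Only the weights and sample counts leave each device in this sketch; the raw audio stays local, which is the usual motivation for a federated setup.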

9.

Granted invention patent
Media presence detection (Active)

Publication (announcement) number: US11069352B1

Publication (announcement) date: 2021-07-20

Application number: US16278440

Application date: 2019-02-18

Applicant: Amazon Technologies, Inc.

Inventor: Qingming Tang, Ming Sun, Chieh-Chi Kao, Chao Wang, Viktor Rozgic

IPC: G10L15/22, G10L25/78, G10L15/16, G10L15/02

Abstract: Described herein is a system for media presence detection in audio. The system analyzes audio data to recognize whether a given audio segment contains sounds from a media source as a way of differentiating recorded media source sounds from other live sounds. In exemplary embodiments, the system includes a hierarchical model architecture for processing audio data segments, where individual audio data segments are processed by a trained machine learning model operating locally, and another trained machine learning model provides historical and contextual information to determine a score indicating the likelihood that the audio data segment contains sounds from a media source.

10.

Granted invention patent
Acoustic event detection (Active)

Publication (announcement) number: US12087320B1

Publication (announcement) date: 2024-09-10

Application number: US17671194

Application date: 2022-02-14

Applicant: Amazon Technologies, Inc.

Inventor: Qin Zhang, Qingming Tang, Ming Sun, Chao Wang, Steve Mark Lorusso, Andrew Thomas Bydlon, James Garnet Droppo, Viktor Rozgic, Sripal Mehta, Yang Liu

IPC: G10L25/51, G10L15/18, G10L15/22, G10L15/30

CPC classification number: G10L25/51, G10L15/1815, G10L15/22, G10L15/30

Abstract: A system may be configured to detect custom acoustic events, where the system generates an acoustic event profile for the custom acoustic event based on a natural language description provided by a user and using an audio sample of the described acoustic event. For example, the user may describe the custom acoustic event as “dog bark.” The system may ask the user questions to refine the description (e.g., dog breed, dog gender, age, etc.). Using an audio sample of the refined description, the system may then determine that audio captured in the user's environment is a potential sample of the custom acoustic event. Such captured audio may be presented to the user for confirmation, and then may be used to detect future occurrences of the custom acoustic event in the user's environment.
