Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Shiv Naga Prasad Vitaladevuni"

1.

Granted invention patent
Wakeword detection (active)

Publication (announcement) number: US11355102B1

Publication (announcement) date: 2022-06-07

Application number: US16712539

Application date: 2019-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Yuriy Mishchenko, Thibaud Senechal, Anish N. Shah, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22, G10L15/16, G10L15/06, G06N3/04, G06N3/08, G10L15/30, G10L15/08

Abstract: A neural network model of a user device is trained to map different words represented in audio data to different points in an N-dimensional embedding space. When the user device determines that a mapped point corresponds to a wakeword, it causes further audio processing, such as automatic speech recognition or natural-language understanding, to be performed on the audio data. The user device may first create the wakeword by processing audio data representing the wakeword to determine the mapped point in the embedding space.
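
The enroll-then-match flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the trained network is replaced by a hypothetical lookup table (`TOY_EMBEDDINGS`), and the Euclidean distance test with a `max_distance` cutoff is an assumed way of deciding that a mapped point "corresponds to" the wakeword.

```python
import math

def euclidean(a, b):
    """Distance between two points in the embedding space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def enroll_wakeword(embed, enrollment_audio):
    """Create the wakeword by mapping enrollment audio to its embedding point."""
    return embed(enrollment_audio)

def is_wakeword(embed, audio, wakeword_point, max_distance=0.5):
    """True when the audio maps close enough to the enrolled point (assumed rule)."""
    return euclidean(embed(audio), wakeword_point) <= max_distance

# Hypothetical stand-in for the trained neural network: a lookup table
# from clip names to points in a 3-dimensional embedding space.
TOY_EMBEDDINGS = {
    "enroll_clip": (0.9, 0.1, 0.0),
    "same_word_clip": (0.8, 0.2, 0.1),
    "other_word_clip": (0.0, 0.1, 0.9),
}

def toy_embed(clip_name):
    return TOY_EMBEDDINGS[clip_name]

wakeword_point = enroll_wakeword(toy_embed, "enroll_clip")
same_word_detected = is_wakeword(toy_embed, "same_word_clip", wakeword_point)
other_word_detected = is_wakeword(toy_embed, "other_word_clip", wakeword_point)
print(same_word_detected, other_word_detected)
```

In a real device, a positive match would be the trigger for the further processing the abstract mentions, such as automatic speech recognition.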

2.

Invention application
DIALOG MANAGEMENT FOR MULTIPLE USERS (active)

Publication (announcement) number: US20220093093A1

Publication (announcement) date: 2022-03-24

Application number: US17112227

Application date: 2020-12-04

Applicant: Amazon Technologies, Inc.

Inventor: Prakash Krishnan, Arindam Mandal, Nikko Strom, Pradeep Natarajan, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, David Chi-Wai Tang, Aaron Challenner, Xu Zhang, Krishna Anisetty, Josey Diego Sandoval, Rohit Prasad, Premkumar Natarajan

IPC: G10L15/22, G10L15/08, G10L15/24, G06K9/46, G06K9/62, G06K9/00, G10L15/02

Abstract: A system can operate a speech-controlled device in a mode where the speech-controlled device determines that an utterance is directed at the speech-controlled device using image data showing the user speaking the utterance. If the user is directing the user's gaze at the speech-controlled device while speaking, the system may determine the utterance is system directed and thus may perform further speech processing based on the utterance. If the user's gaze is directed elsewhere, the system may determine the utterance is not system directed (for example directed at another user) and thus the system may not perform further speech processing based on the utterance and may take other actions, for example discarding audio data of the utterance.

3.

Invention application
USER PRESENCE DETECTION (active)

Publication (announcement) number: US20210027798A1

Publication (announcement) date: 2021-01-28

Application number: US17022197

Application date: 2020-09-16

Applicant: Amazon Technologies, Inc.

Inventor: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal

IPC: G10L25/30, G10L25/51, G10L15/02, G10L15/16, G10L15/22, G10L15/30, G10L25/78

Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
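
The scoring, smoothing, and heartbeat steps described here can be sketched in Python as follows. The abstract does not specify the smoothing method or the reporting rule, so the centered moving average, the 0.5 threshold, and the "any frame in the period" heartbeat rule are all assumptions made for illustration, as are the function names.

```python
def smooth_scores(scores, radius=1):
    """Average each frame score with its neighbors within `radius` frames."""
    smoothed = []
    for i in range(len(scores)):
        window = scores[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

def frame_decisions(scores, threshold=0.5, radius=1):
    """Per-frame presence decision based on the smoothed score (assumed threshold)."""
    return [s >= threshold for s in smooth_scores(scores, radius)]

def heartbeat(decisions, frames_per_report):
    """One presence flag per reporting period: True if any frame detected presence."""
    return [
        any(decisions[i:i + frames_per_report])
        for i in range(0, len(decisions), frames_per_report)
    ]

# Made-up per-frame presence scores: one isolated spike, then a sustained run.
raw_scores = [0.1, 0.2, 0.9, 0.1, 0.2, 0.8, 0.9, 0.7, 0.2, 0.1, 0.1, 0.2]
decisions = frame_decisions(raw_scores)
reports = heartbeat(decisions, frames_per_report=4)
print(decisions)
print(reports)
```

With these made-up scores, the isolated spike at frame 2 is smoothed away, while the sustained run at frames 5 to 7 survives and sets the second heartbeat report.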

4.

Granted invention patent
User presence detection (active)

Publication (announcement) number: US10796716B1

Publication (announcement) date: 2020-10-06

Application number: US16157319

Application date: 2018-10-11

Applicant: Amazon Technologies, Inc.

Inventor: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal

IPC: G10L15/00, G10L25/78, G10L15/22, G10L15/02, G10L15/30, G10L15/16, G10L15/08

Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.

5.

Granted invention patent
Keyword spotting using multi-task configuration (active)

Publication (announcement) number: US10304440B1

Publication (announcement) date: 2019-05-28

Application number: US15198578

Application date: 2016-06-30

Applicant: Amazon Technologies, Inc.

Inventor: Sankaran Panchapagesan, Bjorn Hoffmeister, Arindam Mandal, Aparna Khare, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Ming Sun

IPC: G10L15/06, G10L15/08, G10L15/14, G10L15/16, G10L15/28

Abstract: An approach to keyword spotting makes use of acoustic parameters that are trained on a keyword spotting task as well as on a second speech recognition task, for example, a large vocabulary continuous speech recognition task. The parameters may be optimized according to a weighted measure that weighs the keyword spotting task more highly than the other task, and that weighs utterances of a keyword more highly than utterances of other speech. In some applications, a keyword spotter configured with the acoustic parameters is used for trigger or wake word detection.
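
The two levels of weighting mentioned in this abstract (task versus task, keyword utterances versus other speech) can be made concrete with a small Python sketch. The cross-entropy loss, the specific weights, and the dictionary layout of each example are assumptions for illustration; the abstract only says that a weighted measure is optimized.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[target_index])

def multitask_loss(examples, kws_task_weight=0.8, keyword_weight=2.0):
    """Weighted mean loss over both tasks.

    kws_task_weight > 0.5 weighs the keyword spotting task above the second
    (speech recognition) task; keyword_weight > 1 weighs keyword utterances
    above other speech. Both values are hypothetical.
    """
    total, norm = 0.0, 0.0
    for ex in examples:
        per_example = (
            kws_task_weight * cross_entropy(ex["kws_probs"], ex["kws_target"])
            + (1.0 - kws_task_weight) * cross_entropy(ex["asr_probs"], ex["asr_target"])
        )
        utterance_weight = keyword_weight if ex["is_keyword"] else 1.0
        total += utterance_weight * per_example
        norm += utterance_weight
    return total / norm

# Made-up model outputs for one keyword utterance and one non-keyword utterance.
examples = [
    {"kws_probs": [0.4, 0.6], "kws_target": 1,
     "asr_probs": [0.5, 0.3, 0.2], "asr_target": 0, "is_keyword": True},
    {"kws_probs": [0.9, 0.1], "kws_target": 0,
     "asr_probs": [0.1, 0.8, 0.1], "asr_target": 1, "is_keyword": False},
]
loss_equal = multitask_loss(examples, keyword_weight=1.0)
loss_keyword_heavy = multitask_loss(examples, keyword_weight=4.0)
print(loss_equal, loss_keyword_heavy)
```

Because the keyword utterance has the larger per-example loss in this toy data, raising `keyword_weight` pulls the combined loss toward it, which is the effect the weighting is meant to have during training.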

6.

Granted invention patent
Estimating false rejection rate in a detection system (active)

Publication (announcement) number: US09589560B1

Publication (announcement) date: 2017-03-07

Application number: US14135309

Application date: 2013-12-19

Applicant: Amazon Technologies, Inc.

Inventor: Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Rohit Prasad

IPC: G10L15/01

CPC classification number: G10L15/01, G06K9/6277

Abstract: Features are disclosed for estimating a false rejection rate in a detection system. The false rejection rate can be estimated by fitting a model to a distribution of detection confidence scores. An estimated false rejection rate can then be computed for confidence scores that fall below a threshold. The false rejection rate and model can be verified once the detection system has been deployed by obtaining additional data with confidence scores falling below the threshold. Adjustments to the model or other operational parameters can be implemented based on the verified false rejection rate, model, or additional data.
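
A minimal Python sketch of the fit-then-estimate idea follows. The abstract does not name the model family, so a normal distribution fitted with the standard library's `statistics.NormalDist` is an assumption. The sketch also ignores that a deployed system may only observe scores above the threshold, in which case a truncated fit would be needed.

```python
from statistics import NormalDist

def estimate_false_rejection_rate(confidence_scores, threshold):
    """Fit a normal model to the scores and return the probability mass below threshold.

    confidence_scores: detection confidences for true occurrences of the event.
    threshold: scores below this value would be rejected by the detector.
    """
    model = NormalDist.from_samples(confidence_scores)
    return model.cdf(threshold)

# Made-up confidence scores, symmetric around 0.8.
scores = [0.70, 0.75, 0.80, 0.85, 0.90]
frr_at_mean = estimate_false_rejection_rate(scores, threshold=0.80)
frr_low_threshold = estimate_false_rejection_rate(scores, threshold=0.60)
print(frr_at_mean, frr_low_threshold)
```

Lowering the threshold lowers the estimated false rejection rate, which is the trade-off that the verification step in the abstract would check against additional data collected after deployment.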

7.

Granted invention patent
Dialog management for multiple users (active)

Publication (announcement) number: US12039975B2

Publication (announcement) date: 2024-07-16

Application number: US17112512

Application date: 2020-12-04

Applicant: Amazon Technologies, Inc.

Inventor: Prakash Krishnan, Arindam Mandal, Siddhartha Reddy Jonnalagadda, Nikko Strom, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, Angeliki Metallinou, Vincent Auvray, Minmin Shen, Josey Diego Sandoval, Rohit Prasad, Thomas Taylor, Amotz Maimon

IPC: G10L15/22, G06F3/16, G06F18/24, G06V10/40, G06V40/10, G06V40/20, G10L13/08, G10L15/02, G10L15/06, G10L15/08, G10L15/20, G10L15/24

CPC classification number: G10L15/22, G06F3/167, G06F18/24, G06V10/40, G06V40/10, G06V40/20, G10L13/08, G10L15/02, G10L15/063, G10L15/08, G10L15/20, G10L15/222, G10L15/24, G10L2015/0635, G10L2015/088, G10L2015/223, G10L2015/227

Abstract: A natural language system may be configured to act as a participant in a conversation between two users. The system may determine when a user expression such as speech, a gesture, or the like is directed from one user to the other. The system may process input data related to the expression (such as audio data, input data, language processing result data, conversation context data, etc.) to determine if the system should interject a response to the user-to-user expression. If so, the system may process the input data to determine a response and output it. The system may track that response as part of the data related to the ongoing conversation.

8.

Granted invention patent
Predictive deletion of user input (active)

Publication (announcement) number: US11769496B1

Publication (announcement) date: 2023-09-26

Application number: US16711967

Application date: 2019-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Rohit Prasad, Shiv Naga Prasad Vitaladevuni, Prem Natarajan

IPC: G10L15/22, H04L29/08, G10L15/18, G10L15/06, H04L67/306, G10L15/08, G06F21/62, G10L15/07

CPC classification number: G10L15/22, G10L15/063, G10L15/1815, H04L67/306, G06F21/6245, G10L15/07, G10L2015/088, G10L2015/223, G10L2015/227

Abstract: Described are techniques for predicting when data associated with a user input is likely to be selected for deletion. The system may use a trained model to assist with such predictions. The trained model can be configured based on deletions associated with a user profile. An example process can include receiving user input data corresponding to the user profile, and processing the user input data to determine a user command. Based on characteristic data of the user command, the trained model can be used to determine that data corresponding to the user command is likely to be selected for deletion. The trained model can be iteratively updated based on additional user commands, including previously received user commands to delete user input data.

9.

Invention publication
WAKEWORD DETECTION USING A NEURAL NETWORK (under examination, published)

Publication (announcement) number: US20230162728A1

Publication (announcement) date: 2023-05-25

Application number: US18070830

Application date: 2022-11-29

Applicant: Amazon Technologies, Inc.

Inventor: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal

IPC: G10L15/16, G06F17/15, G10L15/06

CPC classification number: G10L15/16, G06F17/15, G10L15/063, G10L2015/088

Abstract: A system and method performs wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.
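
One possible reading of how three position-sensitive outputs could yield a beginpoint and endpoint is sketched below in Python. The abstract does not give the rule, so everything here is an assumption: all three windows are taken to have the same length, the "right side" output is assumed to peak when the window's last frame coincides with the end of the wakeword, the "left side" output when the window's first frame coincides with its start, and the "center" output is used only to confirm a detection.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

def locate_wakeword(right_scores, center_scores, left_scores,
                    window_frames, min_center_score=0.5):
    """Return (beginpoint, endpoint) in frames, or None if no wakeword is confirmed.

    Each score list is indexed by the window's starting frame, so index i
    scores the window covering frames i .. i + window_frames - 1.
    """
    if max(center_scores) < min_center_score:
        return None
    # Right-side peak: the window's last frame is taken as the wakeword's end.
    endpoint = argmax(right_scores) + window_frames - 1
    # Left-side peak: the window's first frame is taken as the wakeword's start.
    beginpoint = argmax(left_scores)
    return beginpoint, endpoint

def peaked(length, peak_index):
    """Made-up score track with a single peak."""
    scores = [0.05] * length
    scores[peak_index] = 0.9
    return scores

# Toy case: the wakeword spans frames 12..17 and windows are 10 frames long.
right = peaked(20, 8)     # window 8..17 ends with the wakeword
center = peaked(20, 10)   # window 10..19 has the wakeword in its middle
left = peaked(20, 12)     # window 12..21 starts with the wakeword
span = locate_wakeword(right, center, left, window_frames=10)
print(span)
```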

10.

Granted invention patent
Wakeword and acoustic event detection (active)

Publication (announcement) number: US11043218B1

Publication (announcement) date: 2021-06-22

Application number: US16452964

Application date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22, G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
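
The two post-processing chains named in this abstract (softmax, smoothing, and spike detection for the wakeword; sigmoid and a classifier for the acoustic event) can be sketched in Python. The models themselves are not shown: the per-frame logits are made up, the trailing moving average and rising-edge spike rule are assumed choices, and a plain threshold stands in for the unspecified classifier.

```python
import math

def softmax(logits):
    """Convert logits to probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Squash a single logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def moving_average(values, width=3):
    """Trailing average over the last `width` values (assumed smoothing)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out

def find_spikes(values, threshold=0.6):
    """Frames where the smoothed posterior first rises to or above the threshold."""
    return [
        i for i in range(len(values))
        if values[i] >= threshold and (i == 0 or values[i - 1] < threshold)
    ]

def detect_event(event_logit, threshold=0.5):
    """Stand-in classifier: threshold the sigmoid output."""
    return sigmoid(event_logit) >= threshold

# Made-up per-frame logits from the wakeword model: [background, wakeword].
frame_logits = [[2, 0], [2, 0], [0, 3], [0, 3], [0, 3], [2, 0], [2, 0]]
wakeword_posteriors = [softmax(logits)[1] for logits in frame_logits]
smoothed = moving_average(wakeword_posteriors)
spikes = find_spikes(smoothed)
print(spikes)
```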
