Patent search ap:("Amazon Technologies, Inc.") AND inv:"Yixin Gao"

1.

Granted invention patent
Wakeword and acoustic event detection (Active)

Publication number: US11043218B1

Publication date: 2021-06-22

Application number: US16452964

Application date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor： Ming Sun , Thibaud Senechal , Yixin Gao , Anish N. Shah , Spyridon Matsoukas , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22 , G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
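
As a rough illustration of the wakeword post-processing this abstract describes, the following Python sketch applies a softmax to per-frame model outputs, smooths the wakeword posterior with a moving average, and flags spikes. The two-class logits, the 3-frame window, the 0.8 threshold, and all function names are assumptions made for illustration; the abstract does not specify them, and this is not the patented implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame's class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moving_average(values, window):
    """Trailing moving average; early frames average over what is available."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def detect_spikes(values, threshold):
    """Return indices of local maxima that exceed the threshold."""
    spikes = []
    for i in range(1, len(values) - 1):
        is_peak = values[i] >= values[i - 1] and values[i] > values[i + 1]
        if values[i] > threshold and is_peak:
            spikes.append(i)
    return spikes


# Hypothetical (background, wakeword) logits for 8 consecutive frames.
frame_logits = [
    [2.0, -2.0], [2.0, -2.0], [0.0, 1.0], [-2.0, 3.0],
    [-3.0, 4.0], [-2.0, 3.0], [1.0, 0.0], [2.0, -2.0],
]

# Index 1 is assumed to be the wakeword class.
posteriors = [softmax(frame)[1] for frame in frame_logits]
smoothed = moving_average(posteriors, window=3)
spikes = detect_spikes(smoothed, threshold=0.8)
print(spikes)
```

A sigmoid would replace the softmax for the acoustic-event branch, where several events may be present at once and the per-class scores need not sum to one.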

2.

Granted invention patent
Wakeword and acoustic event detection (Active)

Publication number: US11132990B1

Publication date: 2021-09-28

Application number: US16453063

Application date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor： Ming Sun , Thibaud Senechal , Yixin Gao , Anish N. Shah , Spyridon Matsoukas , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/08 , G10L25/87 , G06F3/16 , G10L25/21

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

3.

Granted invention patent
Multilingual wakeword detection (Active)

Publication number: US11069353B1

Publication date: 2021-07-20

Application number: US16404536

Application date: 2019-05-06

Applicant: Amazon Technologies, Inc.

Inventor： Yixin Gao , Ming Sun , Jason Krone , Shiv Naga Prasad Vitaladevuni , Yuzong Liu

IPC: G10L25/00 , G10L15/00 , G10L15/04 , G10L15/22 , G10L15/08 , G10L25/78 , G10L15/14 , G10L15/16

Abstract: A system and method perform multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
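
The two-stage flow in this abstract can be sketched in Python as below. This is only an assumed control-flow outline: the function names, the 0.5 confirmation threshold, and the toy detectors (an energy check and fixed scores) are hypothetical placeholders standing in for the real first-stage and second-stage models, which the abstract does not detail.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Audio = Sequence[float]
FirstStage = Callable[[Audio], Tuple[bool, Optional[str]]]
SecondStage = Callable[[Audio], float]


def two_stage_detect(
    audio: Audio,
    first_stage: FirstStage,
    second_stage_by_language: Dict[str, SecondStage],
    confirm_threshold: float = 0.5,
) -> Optional[str]:
    """Return the confirmed language, or None if no wakeword is confirmed."""
    detected, language = first_stage(audio)
    if not detected or language is None:
        return None
    # Use the language guessed by the cheap detector to pick the verifier.
    verifier = second_stage_by_language.get(language)
    if verifier is None:
        return None
    score = verifier(audio)
    return language if score >= confirm_threshold else None


def toy_first_stage(audio: Audio) -> Tuple[bool, Optional[str]]:
    """Placeholder: 'detects' on energy and guesses a language arbitrarily."""
    energy = sum(sample * sample for sample in audio) / len(audio)
    if energy < 0.01:
        return False, None
    return True, "en" if audio[0] >= 0 else "de"


# Placeholder verifiers that return fixed confidence scores.
toy_verifiers: Dict[str, SecondStage] = {
    "en": lambda audio: 0.9,
    "de": lambda audio: 0.2,
}

result_en = two_stage_detect([0.5, -0.4, 0.3], toy_first_stage, toy_verifiers)
result_de = two_stage_detect([-0.5, 0.4, 0.3], toy_first_stage, toy_verifiers)
result_quiet = two_stage_detect([0.0, 0.01, 0.0], toy_first_stage, toy_verifiers)
print(result_en, result_de, result_quiet)
```

In this sketch, only audio for which a language is returned would be forwarded to the remote system.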

4.

Granted invention patent
Acoustic trigger detection (Active)

Publication number: US10460722B1

Publication date: 2019-10-29

Application number: US15639175

Application date: 2017-06-30

Applicant: Amazon Technologies, Inc.

Inventor： Ming Sun , David Snyder , Yixin Gao , Nikko Strom , Spyros Matsoukas , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/06 , G10L15/16 , H04M1/27 , G06F3/16 , G10L15/32 , G10L15/22

Abstract: A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by combining a “time delay” structure in which intermediate results of computations are reused at various time delays, thereby avoiding the computation of new results, and decomposition of certain transformations to require fewer arithmetic operations without sacrificing significant performance. For a given amount of computation capacity the combination of these two techniques provides improved accuracy as compared to current approaches.
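
One common way to decompose a transformation so that it needs fewer arithmetic operations is a low-rank factorization, W ≈ U·V. The Python sketch below illustrates only that idea; the abstract does not say which decomposition the patent uses, and the matrices, sizes, and function names here are assumptions. The time-delay reuse of intermediate results is not shown.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def mults_full(m, n):
    """Multiplications for y = W x with W of shape m x n."""
    return m * n


def mults_factored(m, n, r):
    """Multiplications for y = U (V x) with U: m x r and V: r x n."""
    return r * (m + n)


# Hypothetical rank-2 factors of a 4 x 6 transformation.
U = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
V = [[1.0, 2.0, 0.0, -1.0, 3.0, 0.5],
     [0.0, 1.0, 1.0, 2.0, -2.0, 1.0]]
W = matmul(U, V)  # the full matrix that the two factors represent

x = [1.0, -1.0, 2.0, 0.5, 0.0, 4.0]
full = matvec(W, x)                  # one large multiply
factored = matvec(U, matvec(V, x))   # two small multiplies, same result

# Savings grow with layer size, e.g. a 256 x 256 layer at rank 32.
print(full, factored)
print(mults_full(256, 256), mults_factored(256, 256, 32))
```

In practice the factors only approximate a trained matrix, so the rank trades accuracy against computation.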

5.

Granted invention patent
Multilingual wakeword detection (Active)

Publication number: US11996097B2

Publication date: 2024-05-28

Application number: US17359937

Application date: 2021-06-28

Applicant: Amazon Technologies, Inc.

Inventor： Yixin Gao , Ming Sun , Jason Krone , Shiv Naga Prasad Vitaladevuni , Yuzong Liu

IPC: G10L15/00 , G10L15/08 , G10L15/14 , G10L15/16 , G10L15/22 , G10L25/78 , G06F40/263

CPC classification number: G10L15/22 , G10L15/005 , G10L15/08 , G10L15/142 , G10L15/16 , G10L25/78 , G06F40/263 , G10L2015/088

Abstract: A system and method perform multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.

6.

Granted invention patent
Wakeword and acoustic event detection (Active)

Publication number: US11670299B2

Publication date: 2023-06-06

Application number: US17321999

Application date: 2021-05-17

Applicant: Amazon Technologies, Inc.

Inventor： Ming Sun , Thibaud Senechal , Yixin Gao , Anish N. Shah , Spyridon Matsoukas , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22 , G10L15/16

CPC classification number: G10L15/22 , G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

7.

Granted invention patent
Wakeword detection using multi-word model (Active)

Publication number: US11308939B1

Publication date: 2022-04-19

Application number: US16140737

Application date: 2018-09-25

Applicant: Amazon Technologies, Inc.

Inventor： Yixin Gao , Ming Sun , Varun Nagaraja , Gengshen Fu , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/14 , G10L15/22 , G06F3/16 , G10L15/08

Abstract: A system and method perform wakeword detection and automatic speech recognition (ASR) using the same acoustic model. A mapping engine maps phones/senones output by the acoustic model to phones/senones corresponding to the wakeword. A hidden Markov model (HMM) may determine that the wakeword is present in audio data; the HMM may have multiple paths for multiple wakewords or may have multiple models. Once the wakeword is detected, ASR is performed using the acoustic model.

8.

Invention patent application
MULTILINGUAL WAKEWORD DETECTION (Active)

Publication number: US20210398533A1

Publication date: 2021-12-23

Application number: US17359937

Application date: 2021-06-28

Applicant: Amazon Technologies, Inc.

Inventor： Yixin Gao , Ming Sun , Jason Krone , Shiv Naga Prasad Vitaladevuni , Yuzong Liu

IPC: G10L15/22 , G10L15/08 , G10L25/78 , G10L15/14 , G10L15/16 , G10L15/00

Abstract: A system and method perform multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.

9.

Invention patent application
WAKEWORD AND ACOUSTIC EVENT DETECTION (Active)

Publication number: US20210358497A1

Publication date: 2021-11-18

Application number: US17321999

Application date: 2021-05-17

Applicant: Amazon Technologies, Inc.

Inventor： Ming Sun , Thibaud Senechal , Yixin Gao , Anish N. Shah , Spyridon Matsoukas , Chao Wang , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22 , G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

10.

Granted invention patent
Binary target acoustic trigger detection (Active)

Publication number: US10460729B1

Publication date: 2019-10-29

Application number: US15639254

Application date: 2017-06-30

Applicant: Amazon Technologies, Inc.

Inventor： Ming Sun , Aaron Lee Mathers Challenner , Yixin Gao , Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22 , G10L25/30 , G10L15/02 , G10L15/16

Abstract: A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by using a neural network to determine an indicator of presence of the acoustic trigger. In some examples, the neural network combines a “time delay” structure in which intermediate results of computations are reused at various time delays, thereby avoiding the computation of new results, and decomposition of certain transformations to require fewer arithmetic operations without sacrificing significant performance.
