Patent search ap:("Amazon Technologies Page Inc.") AND inv:"Thibaud Senechal"

1.

Granted invention patent
Wakeword and acoustic event detection (status: in force)

Publication number: US11132990B1

Publication date: 2021-09-28

Application number: US16453063

Application date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/08, G10L25/87, G06F3/16, G10L25/21

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
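
Illustrative sketch (not taken from the patent): the post-processing named in the abstract above can be pictured roughly as follows. The wakeword branch applies a softmax to per-frame model outputs, smooths the wakeword posterior, and looks for spikes; the acoustic-event branch applies a sigmoid. The logits, the window length, both thresholds, the event names, and the use of a fixed cut-off in place of the classifier are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one frame's raw model outputs."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smooth(values, window=3):
    """Trailing moving average; the window length is an assumed parameter."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def detect_spikes(values, threshold=0.8):
    """Indices of local maxima at or above an assumed threshold."""
    spikes = []
    for i in range(1, len(values) - 1):
        rising = values[i] > values[i - 1]
        not_falling_yet = values[i] >= values[i + 1]
        if values[i] >= threshold and rising and not_falling_yet:
            spikes.append(i)
    return spikes

def sigmoid(x):
    """Logistic function mapping a raw score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical per-frame logits in the order [non-wakeword, wakeword].
frame_logits = [[2.0, -2.0], [1.0, 0.0], [-3.0, 3.0], [-4.0, 4.0],
                [-3.0, 3.0], [1.0, -1.0], [2.0, -2.0]]
wake_posteriors = [softmax(frame)[1] for frame in frame_logits]
smoothed = smooth(wake_posteriors, window=2)
spikes = detect_spikes(smoothed, threshold=0.8)

# Hypothetical acoustic-event logits for one frame. A fixed 0.5 cut-off stands
# in for the classifier, which the abstract does not describe.
event_logits = {"glass_break": 2.2, "dog_bark": -1.5}
event_detected = {name: sigmoid(x) >= 0.5 for name, x in event_logits.items()}
```

A sigmoid scores each event independently, so several events can be flagged in the same frame, whereas the softmax forces the wakeword classes to compete.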

2.

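Published invention application
WAKEWORD DETECTION USING A NEURAL NETWORK (status: under examination, published)

Publication number: US20230162728A1

Publication date: 2023-05-25

Application number: US18070830

Application date: 2022-11-29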

Applicant: Amazon Technologies, Inc.

Inventor: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal

IPC: G10L15/16, G06F17/15, G10L15/06

CPC classification number: G10L15/16, G06F17/15, G10L15/063, G10L2015/088

Abstract: A system and method perform wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.
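
Illustrative sketch (not taken from the patent): one way three position-sensitive outputs could yield a beginpoint and endpoint. The abstract does not say how the outputs are combined, so everything here is an assumption: each score at frame t is taken to describe a fixed-length window ending at t, the "right side" output is assumed to peak when the wakeword's end meets the window's right edge, the "left side" output when its start meets the left edge, and the "center" output is used only as a detection gate. The function name, score tracks, window length, and threshold are hypothetical.

```python
def estimate_bounds(right_scores, center_scores, left_scores,
                    window_frames, threshold=0.5):
    """Return (beginpoint, endpoint) as frame indices, or None if no detection.

    Assumption: scores[t] describes the window covering frames
    t - window_frames + 1 through t.
    """
    if max(center_scores) < threshold:
        return None  # center output never fired: treat as no wakeword
    # Frame at which each edge-aligned output is strongest.
    t_right = max(range(len(right_scores)), key=right_scores.__getitem__)
    t_left = max(range(len(left_scores)), key=left_scores.__getitem__)
    endpoint = t_right                              # wakeword end = right edge
    beginpoint = max(0, t_left - window_frames + 1)  # wakeword start = left edge
    return beginpoint, endpoint

# Hypothetical score tracks for a wakeword spanning frames 4..6, window of 5.
right = [0.0, 0.0, 0.0, 0.0, 0.1, 0.4, 0.9, 0.3, 0.1, 0.0]
center = [0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.5, 0.95, 0.4, 0.1]
left = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.2, 0.9, 0.3]
bounds = estimate_bounds(right, center, left, window_frames=5)
```

Under these assumptions the right-aligned peak arrives first, as the wakeword enters the sliding window, and the left-aligned peak arrives last, as it leaves.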

3.

Granted invention patent
Wakeword and acoustic event detection (status: in force)

Publication number: US11043218B1

Publication date: 2021-06-22

Application number: US16452964

Application date: 2019-06-26

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22, G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

4.

Granted invention patent
Speech processing using a recurrent neural network (status: in force)

Publication number: US11205420B1

Publication date: 2021-12-21

Application number: US16436562

Application date: 2019-06-10

Applicant: Amazon Technologies, Inc.

Inventor: Gengshen Fu, Thibaud Senechal, Shiv Naga Prasad Vitaladevuni, Michael J. Rodehorst, Varun K. Nagaraja

IPC: G10L15/16, G10L15/22, G10L15/06, G06N3/04, G06N3/02, G10L25/30, G10L15/08

Abstract: A system and method perform wakeword detection using a neural network model that includes a recurrent neural network (RNN) for processing variable-length wakewords. To prevent the model from being influenced by non-wakeword speech, multiple instances of the model are created to process audio data, and each instance is configured to use weights determined by training data. The model may instead or in addition be used to process the audio data only when a likelihood that the audio data corresponds to the wakeword is greater than a threshold. The model may process the audio data as represented by groups of acoustic feature vectors; computations for feature vectors common to different groups may be re-used.
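
Illustrative sketch (not taken from the patent): the last sentence of the abstract above describes re-using computations for feature vectors shared by overlapping groups. The snippet below shows that idea only, using a simple cache keyed by frame index. It is not an RNN; `per_frame_transform` is a hypothetical stand-in for whatever per-vector computation the real model performs, and the group size, hop, and averaging step are assumptions.

```python
def per_frame_transform(vec):
    """Hypothetical stand-in for an expensive per-feature-vector computation."""
    return sum(v * v for v in vec)

def score_groups(frames, group_size, hop):
    """Score overlapping groups of frames, computing each frame only once.

    Returns (scores, calls), where calls counts how many times
    per_frame_transform actually ran.
    """
    cache = {}   # frame index -> transformed value, shared across groups
    calls = 0
    scores = []
    for start in range(0, len(frames) - group_size + 1, hop):
        total = 0.0
        for idx in range(start, start + group_size):
            if idx not in cache:
                cache[idx] = per_frame_transform(frames[idx])
                calls += 1
            total += cache[idx]
        scores.append(total / group_size)
    return scores, calls

# Six hypothetical 2-dimensional feature vectors, grouped 4 at a time, hop 1.
frames = [[0.1, 0.2], [0.3, 0.1], [0.5, 0.5], [0.9, 0.8], [0.4, 0.2], [0.1, 0.1]]
scores, calls = score_groups(frames, group_size=4, hop=1)
```

With three groups of four frames there are twelve frame visits but only six transform calls, because neighbouring groups share three of their four frames.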

5.

Granted invention patent
Text detection using features associated with neighboring glyph pairs (status: in force)

Publication number: US09367736B1

Publication date: 2016-06-14

Application number: US14842125

Application date: 2015-09-01

Applicant: Amazon Technologies, Inc.

Inventor: Thibaud Senechal, Quan Wang, Daniel Makoto Willenson, Shuang Wu, Yue Liu, Shiv Naga Prasad Vitaladevuni, David Paul Ramos, Qingfeng Yu

IPC: G06K9/46, G06K9/00, G06K9/34

CPC classification number: G06K9/00463, G06K9/00442, G06K9/00456, G06K9/344, G06K9/348, G06K9/4638, G06K9/4652, G06K2209/01

Abstract: A multi-orientation text detection method and associated system is disclosed that utilizes orientation-variant glyph features to determine a text line in an image regardless of an orientation of the text line. Glyph features are determined for each glyph in an image with respect to a neighboring glyph. The glyph features are provided to a learned classifier that outputs a glyph pair score for each neighboring glyph pair. Each glyph pair score indicates a likelihood that the corresponding pair of neighboring glyphs form part of a same text line. The glyph pair scores are used to identify candidate text lines, which are then ranked to select a final set of text lines in the image.
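
Illustrative sketch (not taken from the patent): the abstract above says glyph pair scores are used to identify candidate text lines, which are then ranked. The snippet below assumes one possible reading: glyphs are linked whenever their pair score clears a threshold (using a union-find structure), and candidate lines are ranked by the mean score of their internal pairs. The threshold, the grouping method, the ranking criterion, and the sample scores are assumptions; the learned classifier that produces the scores is not modelled.

```python
def group_glyphs(num_glyphs, pair_scores, threshold=0.5):
    """Link glyphs whose pair score meets the threshold; return multi-glyph groups."""
    parent = list(range(num_glyphs))

    def find(i):
        # Follow parent links to the group root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), score in pair_scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for glyph in range(num_glyphs):
        groups.setdefault(find(glyph), []).append(glyph)
    return [members for members in groups.values() if len(members) > 1]

def rank_lines(lines, pair_scores):
    """Sort candidate lines by the mean score of the scored pairs inside each line."""
    def mean_score(line):
        members = set(line)
        inside = [s for (a, b), s in pair_scores.items()
                  if a in members and b in members]
        return sum(inside) / len(inside)
    return sorted(lines, key=mean_score, reverse=True)

# Hypothetical classifier outputs for neighbouring glyph pairs (glyph ids 0..4).
pair_scores = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.7}
candidates = group_glyphs(5, pair_scores, threshold=0.5)
ranked = rank_lines(candidates, pair_scores)
```

Here the low score between glyphs 2 and 3 splits the glyphs into two candidate lines, and the line with the higher average pair score is ranked first.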

6.

Granted invention patent
Wakeword detection using a neural network (status: in force)

Publication number: US11521599B1

Publication date: 2022-12-06

Application number: US16577351

Application date: 2019-09-20

Applicant: Amazon Technologies, Inc.

Inventor: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal

IPC: G10L15/16, G06F17/15, G10L15/06, G10L15/08

Abstract: A system and method perform wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.

7.

Invention application
WAKEWORD AND ACOUSTIC EVENT DETECTION (status: in force)

Publication number: US20210358497A1

Publication date: 2021-11-18

Application number: US17321999

Application date: 2021-05-17

Applicant: Amazon Technologies, Inc.

Inventor: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22, G10L15/16

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

8.

Granted invention patent
Wakeword detection (status: in force)

Publication number: US11355102B1

Publication date: 2022-06-07

Application number: US16712539

Application date: 2019-12-12

Applicant: Amazon Technologies, Inc.

Inventor: Yuriy Mishchenko, Thibaud Senechal, Anish N. Shah, Shiv Naga Prasad Vitaladevuni

IPC: G10L15/22, G10L15/16, G10L15/06, G06N3/04, G06N3/08, G10L15/30, G10L15/08

Abstract: A neural network model of a user device is trained to map different words represented in audio data to different points in an N-dimensional embedding space. When the user device determines that a mapped point corresponds to a wakeword, it causes further audio processing, such as automatic speech recognition or natural-language understanding, to be performed on the audio data. The user device may first create the wakeword by processing audio data representing the wakeword to determine the mapped point in the embedding space.
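
Illustrative sketch (not taken from the patent): the abstract above describes mapping audio to points in an embedding space and checking whether a mapped point corresponds to a wakeword. The snippet below assumes the embeddings are already available as plain lists of floats, enrolls a wakeword as the centroid of a few example embeddings, and treats a new point as a match when it lies within a fixed Euclidean radius. The neural network itself is not modelled, and the centroid enrollment, the distance metric, the radius, and the sample vectors are all assumptions.

```python
import math

def enroll_wakeword(example_embeddings):
    """Average several example embeddings into one enrolled point (assumed scheme)."""
    dims = len(example_embeddings[0])
    count = len(example_embeddings)
    return [sum(vec[d] for vec in example_embeddings) / count for d in range(dims)]

def is_wakeword(embedding, enrolled_point, radius=0.2):
    """True when the embedding lies within an assumed radius of the enrolled point."""
    return math.dist(embedding, enrolled_point) <= radius

# Hypothetical 2-dimensional embeddings of two recordings of the wakeword.
enrolled = enroll_wakeword([[1.0, 0.0], [0.8, 0.2]])

# Hypothetical embeddings of new audio: one near the enrolled point, one far.
near_match = is_wakeword([0.95, 0.05], enrolled)
far_miss = is_wakeword([0.0, 1.0], enrolled)
```

In a real system a positive match would be what triggers the further processing the abstract mentions, such as automatic speech recognition.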
