Patent search ap:("AMAZON TECHNOLOGIES Page INC.") AND inv:"Bjorn Hoffmeister"

11.

发明授权
Keyword detection modeling using contextual information 有权

公开(公告)号：US10832662B2

公开(公告)日：2020-11-10

申请号：US15641169

申请日：2017-07-03

Applicant: Amazon Technologies, Inc.

Inventor： Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister

IPC: G10L15/18 , G10L15/08 , G10L15/30

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based usage patterns associated with the users.

12.

发明授权
Neural network processing of multiple feature streams using max pooling and restricted connectivity 有权

公开(公告)号：US09886948B1

公开(公告)日：2018-02-06

申请号：US14589631

申请日：2015-01-05

Applicant: Amazon Technologies, Inc.

Inventor： Sri Venkata Surya Siva Rama Krishna Garimella , Bjorn Hoffmeister

IPC: G10L15/16 , G10L15/02 , G10L15/00 , G10L21/02 , G10L15/14 , G10L15/18 , G10L21/0208

CPC classification number: G10L15/16 , G10L15/00 , G10L15/02 , G10L15/14 , G10L15/18 , G10L15/183 , G10L21/02 , G10L21/0208 , G10L2015/025

Abstract: Features are disclosed for improving the robustness of a neural network by using multiple (e.g., two or more) feature streams, combing data from the feature streams, and comparing the combined data to data from a subset of the feature streams (e.g., comparing values from the combined feature stream to values from one of the component feature streams of the combined feature stream). The neural network can include a component or layer that selects the data with the highest value, which can suppress or exclude some or all corrupted data from the combined feature stream. Subsequent layers of the neural network can restrict connections from the combined feature stream to a component feature stream to reduce the possibility that a corrupted combined feature stream will corrupt the component feature stream.

13.

发明授权
Markov-based sequence tagging using neural networks 有权

公开(公告)号：US09600764B1

公开(公告)日：2017-03-21

申请号：US14307412

申请日：2014-06-17

Applicant: Amazon Technologies, Inc.

Inventor： Ariya Rastrow , Spyros Matsoukas , Sri Venkata Surya Siva Rama Krishna Garimella , Nikko Ström , Bjorn Hoffmeister

IPC: G06F15/18 , G06N3/08

CPC classification number: G06N3/08 , G06N3/0445 , G06N3/049

Abstract: Features are disclosed for using a neural network to tag sequential input without using an internal representation of the neural network generated when scoring previous positions in the sequence. A predicted or determined label (e.g., the highest scoring or otherwise most probable label) for input at a given position in the sequence can be used when scoring input corresponding to the next position the sequence. Additional features are disclosed for training a neural network for use in tagging sequential input without using an internal representation of the neural network generated when scoring previous positions the sequence.

14.

发明申请
SPEECH MODEL RETRIEVAL IN DISTRIBUTED SPEECH RECOGNITION SYSTEMS 审中-公开
Title translation: 分布式语音识别系统中的语音模型检索

公开(公告)号：US20160071519A1

公开(公告)日：2016-03-10

申请号：US14942551

申请日：2015-11-16

Applicant: Amazon Technologies, Inc.

Inventor： Bjorn Hoffmeister , Hugh Evan Secker-Walker , Jeffrey Cornelius O'Neill

IPC: G10L15/32 , G10L15/30

CPC classification number: G10L15/32 , G10L15/22 , G10L15/30

Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that it may be used to update the models and data as it becomes available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.

Abstract translation: 公开了用于管理语音识别模型和自动语音识别系统中的数据的使用的特征。可以异步检索模型和数据，并在收到文字或使用更为一般或不同的模型对话语进行初始处理之后进行使用。一旦收到，模型和统计信息可以被缓存。还可以异步检索更新模型和数据所需的统计数据，以便可以在模型和数据可用时更新模型和数据。可以立即使用更新的模型和数据来重新处理话语，或者保存用于处理随后接收的话语。可以跟踪与自动语音识别系统的用户交互，以便预测用户什么时候可能利用该系统。基于这样的预测，模型和数据可以被预先缓存。

15.

发明授权
Using adaptation data with cloud-based speech recognition 有权
Title translation: 使用基于云的语音识别的适应数据

公开(公告)号：US08996372B1

公开(公告)日：2015-03-31

申请号：US13664363

申请日：2012-10-30

Applicant: Amazon Technologies, Inc.

Inventor： Hugh Secker-Walker , Bjorn Hoffmeister , Ryan Thomas , Stan Salvador , Karthik Ramakrishnan

IPC: G10L15/00 , G10L15/06 , G10L15/34

CPC classification number: G10L15/34

Abstract: Speech recognition may be improved using data derived from an utterance. In some embodiments, audio data is received by a user device. Adaptation data may be retrieved from a data store accessible by the user device. The audio data and the adaptation data may be transmitted to a server device. The server device may use the audio data to calculate second adaptation data. The second adaptation data may be transmitted to the user device. Synchronously or asynchronously, the server device may perform speech recognition using the audio data and the second adaptation data and transmit speech recognition results back to the user device.

Abstract translation: 可以使用从话语导出的数据来改善语音识别。在一些实施例中，音频数据由用户设备接收。可以从用户设备可访问的数据存储器中检索适配数据。音频数据和适配数据可以被发送到服务器设备。服务器设备可以使用音频数据来计算第二自适应数据。第二适配数据可以被发送到用户设备。同步或异步地，服务器设备可以使用音频数据和第二自适应数据来执行语音识别，并将语音识别结果发送回用户设备。

16.

发明授权
Endpointing in speech processing 有权

公开(公告)号：US12211517B1

公开(公告)日：2025-01-28

申请号：US17475699

申请日：2021-09-15

Applicant: Amazon Technologies, Inc.

Inventor： Roland Maximilian Rolf Maas , Bjorn Hoffmeister , Ariya Rastrow , James Garnet Droppo , Veerdhawal Pande , Maarten Van Segbroeck , Gautam Tiwari , Andrew Smith , Eli Joshua Fidler

IPC: G10L25/78 , G06N3/045 , G10L15/26 , G10L25/30

Abstract: A speech-processing system may determine potential endpoints in a user's speech. Such endpoint prediction may include determining a potential endpoint in a stream of audio data, and may additionally including determining an endpoint score representing a likelihood that the potential endpoint represents an end of speech representing a complete user input. When the potential endpoint has been determined, the system may publish a transcript of speech that preceded the potential endpoint, and send it to downstream components. The system may continue to transcribe audio data and determine additional potential endpoints while the downstream components process the transcript. The downstream components may determine whether the transcript is complete; e.g., represents the entirety of the user input. Final endpoint determinations may be made based on the results of the downstream processing including automatic speech recognition, natural language understanding, etc.

17.

发明公开
DEVICE-DIRECTED UTTERANCE DETECTION 审中-公开

公开(公告)号：US20230223023A1

公开(公告)日：2023-07-13

申请号：US18149181

申请日：2023-01-03

Applicant: Amazon Technologies, Inc.

Inventor： Ariya Rastrow , Eli Joshua Fidler , Roland Maximilian Rolf Maas , Nikko Strom , Aaron Eakin , Diamond Bishop , Bjorn Hoffmeister , Sanjeev Mishra

IPC: G10L15/22 , G10L15/18 , G10L15/26 , G10L15/08

CPC classification number: G10L15/22 , G10L15/26 , G10L15/1815 , G10L2015/088 , G10L2015/223 , G10L2015/228

Abstract: A speech interface device is configured to detect an interrupt event and process a voice command without detecting a wakeword. The device includes on-device interrupt architecture configured to detect when device-directed speech is present and send audio data to a remote system for speech processing. This architecture includes an interrupt detector that detects an interrupt event (e.g., device-directed speech) with low latency, enabling the device to quickly lower a volume of output audio and/or perform other actions in response to a potential voice command. In addition, the architecture includes a device directed classifier that processes an entire utterance and corresponding semantic information and detects device-directed speech with high accuracy. Using the device directed classifier, the device may reject the interrupt event and increase a volume of the output audio or may accept the interrupt event, causing the output audio to end and performing speech processing on the audio data.

18.

发明授权
Wake word detection modeling 有权

公开(公告)号：US11657804B2

公开(公告)日：2023-05-23

申请号：US17090716

申请日：2020-11-05

Applicant: Amazon Technologies, Inc.

Inventor： Rohit Prasad , Kenneth John Basye , Spyridon Matsoukas , Rajiv Ramachandran , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister

IPC: G10L15/18 , G10L15/08 , G10L15/30

CPC classification number: G10L15/18 , G10L15/08 , G10L15/30 , G10L2015/088

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based usage patterns associated with the users.

19.

发明授权
Language model adaptation 有权

公开(公告)号：US11302310B1

公开(公告)日：2022-04-12

申请号：US16426557

申请日：2019-05-30

Applicant: Amazon Technologies, Inc.

Inventor： Ankur Gandhe , Ariya Rastrow , Roland Maximilian Rolf Maas , Bjorn Hoffmeister

IPC: G10L15/01 , G10L15/065 , G10L15/06

Abstract: Exemplary embodiments relate to adapting a generic language model during runtime using domain-specific language model data. The system performs an audio frame-level analysis, to determine if the utterance corresponds to a particular domain and whether the ASR hypothesis needs to be rescored. The system processes, using a trained classifier, the ASR hypothesis (a partial hypothesis) generated for the audio data processed so far. The system determines whether to rescore the hypothesis after every few audio frames (representing a word in the utterance) are processed by the speech recognition system.

20.

发明授权
Monophone-based background modeling for wakeword detection 有权

公开(公告)号：US10964315B1

公开(公告)日：2021-03-30

申请号：US15639330

申请日：2017-06-30

Applicant: Amazon Technologies, Inc.

Inventor： Minhua Wu , Sankaran Panchapagesan , Ming Sun , Shiv Naga Prasad Vitaladevuni , Bjorn Hoffmeister , Ryan Paul Thomas , Arindam Mandal

IPC: G10L15/22 , G10L15/14 , G10L15/02 , G10L15/16 , G10L25/30 , G10L15/08

Abstract: An approach to wakeword detection uses an explicit representation of non-wakeword speech in the form of subword (e.g., phonetic monophone) units that do not necessarily occur in the wakeword and that broadly represent general speech. These subword units are arranged in a “background” model, which at runtime essentially competes with the wakeword model such that a wakeword is less likely to be declare as occurring when the input matches that background model well. An HMM may be used with the model to locate possible occurrences of the wakeword. Features are determined from portions of the input corresponding to subword units of the wakeword detected using the HMM. A secondary classifier is then used to process the features to yield a decision of whether the wakeword occurred.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification