Abstract:
Features are disclosed for generating intent-specific results in an automatic speech recognition system. The results can be generated by utilizing a decoding graph containing tags that identify portions of the graph corresponding to a given intent. The tags can also identify high-information content slots and low-information carrier phrases for a given intent. The automatic speech recognition system may utilize these tags to provide a semantic representation that preserves the plurality of different tokens recognized in the content slot portions while collapsing the low-information carrier portions. A user can be presented with a user interface containing the top intent results and the corresponding intent-specific top content slot values.
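A minimal Python sketch of how such tags might be consumed; the Token structure, the intent name, and the slot names are illustrative assumptions rather than details from the disclosure:

    from collections import namedtuple

    # Hypothetical decoded token: a word plus the intent and slot tags attached
    # to the matching portion of the decoding graph.
    Token = namedtuple("Token", ["word", "intent", "slot"])

    def semantic_representation(tokens):
        """Keep high-information content slots; collapse low-information carrier words."""
        intent, slots = None, {}
        for tok in tokens:
            intent = intent or tok.intent
            if tok.slot != "carrier":
                slots.setdefault(tok.slot, []).append(tok.word)
        return {"intent": intent,
                "slots": {name: " ".join(words) for name, words in slots.items()}}

    decoded = [Token("play", "PlayMusic", "carrier"),
               Token("some", "PlayMusic", "carrier"),
               Token("jazz", "PlayMusic", "genre"),
               Token("music", "PlayMusic", "carrier")]
    print(semantic_representation(decoded))  # {'intent': 'PlayMusic', 'slots': {'genre': 'jazz'}}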
Abstract:
Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received, or after an utterance is initially processed with more general or different models. Once received, the models and data can be cached. Statistics needed to update the models and data may also be retrieved asynchronously, so that they can be used to perform the updates as they become available. The updated models and data may be used immediately to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
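One way to realize the asynchronous retrieval and caching, sketched in Python; fetch_model and the cache layout are assumptions standing in for whatever retrieval mechanism the system actually uses:

    from concurrent.futures import ThreadPoolExecutor

    _executor = ThreadPoolExecutor(max_workers=2)
    _cache = {}  # user id -> Future resolving to a personalized model

    def fetch_model(user_id):
        # Stand-in for asynchronous retrieval of personalized models/statistics.
        return {"user": user_id, "kind": "personalized"}

    def get_model(user_id, general_model):
        """Use the general model now; switch to the personalized model once the
        asynchronous fetch completes, caching it for subsequent utterances."""
        future = _cache.get(user_id)
        if future is None:
            _cache[user_id] = _executor.submit(fetch_model, user_id)
        elif future.done():
            return future.result()
        return general_model

    general = {"kind": "general"}
    print(get_model("u1", general)["kind"])  # general (fetch just kicked off)
    _cache["u1"].result()                    # retrieval finishes in the background
    print(get_model("u1", general)["kind"])  # personalized (now cached)

Pre-caching based on predicted usage would amount to submitting the fetch ahead of the predicted time rather than on first contact.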
Abstract:
Incremental speech recognition results are generated and used to determine a user's intent from an utterance. Utterance audio data may be partitioned into multiple portions, and incremental speech recognition results may be generated from one or more of the portions. A natural language understanding module or some other language processing module can generate semantic representations of the utterance from the incremental speech recognition results. The stability of the determined intent may be assessed over time, and actions may be taken once certain stability thresholds are met.
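A sketch of one possible stability test, in Python; the consecutive-agreement rule and the threshold value are assumptions, since the disclosure leaves the stability measure open:

    def stable_intent(incremental_results, nlu, threshold=3):
        """Return an intent once the NLU module yields the same intent for
        `threshold` consecutive incremental recognition results, else None."""
        streak, last = 0, None
        for partial_text in incremental_results:
            intent = nlu(partial_text)  # semantic representation of the partial result
            streak = streak + 1 if intent == last else 1
            last = intent
            if streak >= threshold:
                return intent           # stable enough to act on
        return None

    # Toy NLU: the hypothesized intent settles as more audio is decoded.
    fake_nlu = lambda text: "SetAlarm" if "alarm" in text else "Unknown"
    partials = ["set", "set an", "set an alarm", "set an alarm for", "set an alarm for six"]
    print(stable_intent(partials, fake_nlu))  # SetAlarm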
Abstract:
In an automatic speech recognition (ASR) processing system, ASR processing may be configured to reduce the latency of returning speech results to a user. The latency may be determined by comparing a time stamp of an utterance in process to the current time. Latency may also be estimated based on an endpoint of the utterance or other considerations, such as how difficult the utterance may be to process. To improve latency, the ASR system may be configured to adjust various processing parameters, such as graph pruning factors, path weights, ASR models, etc. Latency checks and corrections may occur dynamically for a particular utterance while it is being processed, thus allowing the ASR system to adapt to rapidly changing latency conditions.
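The dynamic adjustment can be illustrated with a beam-width controller in Python; the latency target, bounds, and step size below are invented values, and a real decoder could adjust path weights and models the same way:

    import time

    def adjust_beam(utterance_start, beam, target_latency=0.5,
                    min_beam=6.0, max_beam=16.0, step=1.0):
        """Tighten graph pruning when the utterance has been in process longer
        than the latency target; relax it again when there is headroom."""
        latency = time.monotonic() - utterance_start  # time stamp vs. current time
        if latency > target_latency:
            return max(min_beam, beam - step)         # prune harder, decode faster
        return min(max_beam, beam + step)             # afford a wider search

    # Called periodically while a single utterance is being decoded:
    start, beam = time.monotonic(), 12.0
    beam = adjust_beam(start, beam)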
Abstract:
Features are disclosed for selecting and using multiple transforms associated with a particular remote device for use in automatic speech recognition (“ASR”). Each transform may be based on statistics that have been generated from processing utterances that share some characteristic (e.g., acoustic characteristics, the time frame within which the utterances were processed, etc.). When an utterance is received from the remote device, a particular transform or set of transforms may be selected for use in speech processing based on data obtained from the remote device, speech processing of a portion of the utterance, speech processing of prior utterances, etc. The transform or transforms used in processing the utterance may then be updated based on the results of the speech processing.
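A rough Python sketch of the selection and update steps; the two named transforms, the 13-dimensional features, and the nearest-mean selection rule are all assumptions for illustration:

    import numpy as np

    # Hypothetical per-device store: each transform carries the mean of the
    # features it was estimated from (e.g., utterances from different times of day).
    transforms = {
        "morning": {"A": np.eye(13),       "mean": np.zeros(13)},
        "evening": {"A": np.eye(13) * 1.1, "mean": np.ones(13)},
    }

    def select_transform(first_pass_features):
        """Pick the transform whose statistics lie closest to the features
        of the utterance portion processed so far."""
        mu = first_pass_features.mean(axis=0)
        return min(transforms.values(), key=lambda t: np.linalg.norm(mu - t["mean"]))

    def update_transform(t, features, rate=0.05):
        """Fold the newly processed utterance back into the transform's statistics."""
        t["mean"] = (1 - rate) * t["mean"] + rate * features.mean(axis=0)

    feats = np.random.default_rng(1).normal(0.9, 0.1, size=(30, 13))
    chosen = select_transform(feats)   # the "evening" statistics are nearest
    update_transform(chosen, feats)    # refine the chosen transform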
Abstract:
Power consumption for a computing device may be managed by one or more keywords. For example, if an audio input obtained by the computing device includes a keyword, a network interface module and/or an application processing module of the computing device may be activated. The audio input may then be transmitted via the network interface module to a remote computing device, such as a speech recognition server. Alternatively, the computing device may be provided with a speech recognition engine configured to process the audio input for on-device speech recognition.
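A toy Python sketch of the activation flow; the module classes and the keyword test are placeholders for the device's actual low-power keyword spotter:

    class Module:
        def __init__(self, name):
            self.name, self.active = name, False

        def activate(self):
            if not self.active:          # leave already-active modules alone
                self.active = True
                print(self.name, "powered up")

    class Device:
        def __init__(self):
            self.network = Module("network interface")
            self.app = Module("application processor")

    def on_audio(frames, contains_keyword, device):
        """Stay in a low-power state until the keyword is detected, then wake
        the network/application modules and hand off the audio."""
        for frame in frames:
            if contains_keyword(frame):
                device.network.activate()
                device.app.activate()
                return frame             # would be streamed to the speech server

    dev = Device()
    on_audio(["music", "computer"], lambda f: f == "computer", dev)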
Abstract:
Features are disclosed for automatically identifying a speaker. Artifacts of automatic speech recognition (“ASR”) and/or other automatically determined information may be processed against individual user profiles or models. Scores may be determined reflecting the likelihood that individual users made an utterance. The scores can be based on, e.g., individual components of Gaussian mixture models (“GMMs”) that score best for frames of audio data of an utterance. A user associated with the highest likelihood score for a particular utterance can be identified as the speaker of the utterance. Information regarding the identified user can be provided to components of a spoken language processing system, separate applications, etc.
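A compact Python sketch of the scoring idea; the diagonal-covariance Gaussians and the single-component user models below are simplifications of the GMMs the disclosure refers to:

    import numpy as np

    def log_gauss(x, mean, var):
        """Log density of a diagonal-covariance Gaussian component."""
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

    def score_user(frames, gmm):
        """Sum, over all frames, the log likelihood of the best-scoring component."""
        return sum(max(log_gauss(f, c["mean"], c["var"]) for c in gmm) for f in frames)

    def identify_speaker(frames, user_gmms):
        """Return the user whose model assigns the utterance the highest score."""
        return max(user_gmms, key=lambda u: score_user(frames, user_gmms[u]))

    gmms = {"alice": [{"mean": np.zeros(13),    "var": np.ones(13)}],
            "bob":   [{"mean": np.ones(13) * 3, "var": np.ones(13)}]}
    frames = np.random.default_rng(0).normal(0, 1, size=(50, 13))
    print(identify_speaker(frames, gmms))  # alice (the frames match her model)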
Abstract:
Features are disclosed for transferring speech recognition workloads between pooled execution resources. For example, various parts of an automatic speech recognition engine may be implemented by various pools of servers. Servers in a speech recognition pool may explore a plurality of paths in a graph to find the path that best matches an utterance. A set of active nodes comprising the last node explored in each path may be transferred between servers in the pool depending on resource availability at each server. A history of nodes or arcs traversed in each path may be maintained by a separate pool of history servers, and used to generate text corresponding to the path identified as the best match by the speech recognition servers.
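A skeletal Python sketch of the transfer step; the server records, the load threshold, and the one-node-at-a-time policy are invented for illustration, and path histories are assumed to live with the separate history pool:

    # Each decoding server holds a frontier of active nodes (the last node
    # explored on each path, with its path score) plus a load figure.
    servers = [{"id": 0, "active": {("n42", -13.7), ("n88", -15.2)}, "load": 0.9},
               {"id": 1, "active": set(), "load": 0.2}]

    def rebalance(servers, high=0.8):
        """Move active nodes off overloaded servers; only the frontier travels,
        since the traversal history is kept by the history-server pool."""
        idle = min(servers, key=lambda s: s["load"])
        for s in servers:
            if s["load"] > high and s is not idle and s["active"]:
                node = s["active"].pop()  # a path's last-explored node
                idle["active"].add(node)  # the destination resumes that path

    rebalance(servers)
    print([(s["id"], len(s["active"])) for s in servers])  # [(0, 1), (1, 1)]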