Patent search: ap:("Amazon Technologies, Inc.") AND inv:"Alexander David Rosen"

1.

Granted invention patent
Reducing speech recognition latency (in force)

Publication number: US09514747B1

Publication date: 2016-12-06

Application number: US14011898

Application date: 2013-08-28

Applicant: Amazon Technologies, Inc.

Inventor: Michael Maximilian Emanuel Bisani, Hugh Evan Secker-Walker, Kenneth John Basye, Alexander David Rosen

IPC: G10L21/00, G10L25/93, G10L15/00, G10L15/26, G10L17/00, G10L15/04, G10L25/00, G10L15/22

CPC classification number: G10L15/08, G10L25/60, G10L2015/085

Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to reduce a latency of returning speech results to a user. The latency may be determined by comparing a time stamp of an utterance in process to a current time. Latency may also be estimated based on an endpoint of the utterance or other considerations such as how difficult the utterance may be to process. To improve latency the ASR system may be configured to adjust various processing parameters, such as graph pruning factors, path weights, ASR models, etc. Latency checks and corrections may occur dynamically for a particular utterance while it is being processed, thus allowing the ASR system to adjust to rapidly changing latency conditions.
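The latency check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: `DecoderParams`, `beam_width`, the 0.5 s target, and the step size are all hypothetical names and values chosen for illustration. Latency is taken as the gap between the time stamp of the audio being processed and the current time, and a pruning parameter is tightened when that gap grows too large.

```python
from dataclasses import dataclass


@dataclass
class DecoderParams:
    """Hypothetical decoder settings; names and defaults are illustrative only."""
    beam_width: float = 16.0
    min_beam_width: float = 6.0


def measure_latency(utterance_timestamp: float, now: float) -> float:
    """Latency is how far processing lags behind the audio being decoded."""
    return now - utterance_timestamp


def adjust_for_latency(params: DecoderParams, latency: float,
                       target: float = 0.5, step: float = 2.0) -> DecoderParams:
    """Narrow the beam (prune harder) when latency exceeds the assumed target."""
    if latency > target:
        params.beam_width = max(params.min_beam_width, params.beam_width - step)
    return params


params = DecoderParams()
# Simulated check: audio stamped at t=10.0 s is being processed at t=10.8 s.
latency = measure_latency(utterance_timestamp=10.0, now=10.8)
adjust_for_latency(params, latency)
```

Running the check repeatedly while an utterance is decoded would correspond to the dynamic, per-utterance correction the abstract mentions.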

2.

Granted invention patent
Enhanced endpoint detection for speech recognition (in force)

Publication number: US09437186B1

Publication date: 2016-09-06

Application number: US13921671

Application date: 2013-06-19

Applicant: Amazon Technologies, Inc.

Inventor: Baiyang Liu, Hugh Evan Secker-Walker, Alexander David Rosen

IPC: G10L15/00, G10L15/05, G10L15/22, G10L15/19

CPC classification number: G10L15/05, G10L15/00, G10L15/1815, G10L15/19, G10L15/22, G10L25/78, G10L2015/223

Abstract: Determining the end of an utterance for purposes of automatic speech recognition (ASR) may be improved with a system that provides early results and/or incorporates semantic tagging. Early ASR results of an incoming utterance may be prepared based at least in part on an estimated endpoint and processed by a natural language understanding (NLU) process while final results, based at least in part on a final endpoint, are determined. If the early results match the final results, the early NLU results are already prepared for early execution. The endpoint may also be determined based at least in part on the content of the utterance, as represented by semantic tagging output from ASR processing. If the tagging indicates completion of a logical statement, an endpoint may be declared, or a threshold for silent frames prior to declaring an endpoint may be adjusted.
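The last sentence of this abstract, adjusting the silent-frame threshold when the tagging indicates a complete statement, can be sketched as follows. This is an illustrative reading only: the frame representation, the function name `find_endpoint`, and both threshold values are assumptions, and the `statement_complete` flag stands in for whatever semantic-tagging signal a real ASR system would emit.

```python
from typing import Iterable, Optional, Tuple


def find_endpoint(frames: Iterable[Tuple[bool, bool]],
                  base_threshold: int = 30,
                  complete_threshold: int = 10) -> Optional[int]:
    """Return the index of the frame at which an endpoint is declared, or None.

    Each frame is (is_silent, statement_complete). When the (assumed) semantic
    tagging marks the statement as complete, fewer consecutive silent frames
    are required before the endpoint is declared.
    """
    silent_run = 0
    for i, (is_silent, statement_complete) in enumerate(frames):
        # Count consecutive silent frames; any speech frame resets the run.
        silent_run = silent_run + 1 if is_silent else 0
        threshold = complete_threshold if statement_complete else base_threshold
        if silent_run >= threshold:
            return i
    return None


speech = [(False, False)] * 5
complete = find_endpoint(speech + [(True, True)] * 40)
incomplete = find_endpoint(speech + [(True, False)] * 40)
```

With these assumed thresholds, the utterance tagged as complete is endpointed after 10 silent frames, while the untagged one waits for 30.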

3.

Granted invention patent
Methods and devices for ignoring similar audio being received by a system (in force)

Publication number: US09728188B1

Publication date: 2017-08-08

Application number: US15195587

Application date: 2016-06-28

Applicant: Amazon Technologies, Inc.

Inventor: Alexander David Rosen, Michael James Rodehorst, George Jay Tucker, Aaron Lee Mathers Challenner

IPC: G10L15/22, G10L15/08, G10L25/51, G10L15/28, G10L19/08

CPC classification number: G10L15/22, G10L19/08, G10L25/18, G10L25/51, G10L2015/223

Abstract: Systems and methods for detecting similar audio being received by separate voice activated electronic devices, and ignoring those commands, are described herein. In some embodiments, a voice activated electronic device may be activated by a wakeword that is output by an additional electronic device, such as a television or radio, may capture audio of sound subsequently following the wakeword, and may send audio data representing the sound to a backend system. Upon receipt, the backend system may, in parallel to performing automated speech recognition processing on the audio data, generate a sound profile of the audio data, and may compare that sound profile to sound profiles of recently received audio data and/or flagged sound profiles. If the generated sound profile is determined to match another sound profile, then the automated speech recognition processing may be stopped, and the voice activated electronic device may be instructed to return to a keyword spotting mode. If the matching sound profile is not already stored in a database of known sound profiles, it can be stored for future comparisons.

4.

Granted invention patent
Command suggestions during automatic speech recognition (in force)

Publication number: US09378740B1

Publication date: 2016-06-28

Application number: US14502572

Application date: 2014-09-30

Applicant: Amazon Technologies, Inc.

Inventor: Alexander David Rosen, Yuwang Yin

IPC: G10L15/26, G10L15/18

CPC classification number: G10L15/1822, G06F17/3097, G10L2015/223, G10L2015/228

Abstract: Features are disclosed for identifying and providing command suggestions during automatic speech recognition. As utterances are interpreted, suggestions may be provided based on even partial interpretations to guide users of a client device to commands available via speech recognition.
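One simple way to picture suggestions driven by a partial interpretation is prefix matching against a list of known commands. The sketch below is an assumption made for illustration, not the method disclosed in the patent: the function name, the command list, and the prefix-matching strategy are all hypothetical.

```python
from typing import List


def suggest_commands(partial: str, commands: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` known commands that start with the partial transcript."""
    prefix = partial.strip().lower()
    return [c for c in commands if c.startswith(prefix)][:limit]


# Hypothetical command inventory for a voice-controlled client device.
commands = ["play music", "play the news", "set a timer", "stop"]
suggestions = suggest_commands("Pla", commands)
```

Here the partial transcript "Pla" yields the two commands beginning with "play", which a client could display while the user is still speaking.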

5.

Granted invention patent
Sound profile generation based on speech recognition results exceeding a threshold (in force)

Publication number: US10074364B1

Publication date: 2018-09-11

Application number: US15085772

Application date: 2016-03-30

Applicant: Amazon Technologies, Inc.

Inventor: Colin Wills Wightman, Naresh Narayanan, Alexander David Rosen, Michael James Rodehorst, Daniel Robert Rashid

IPC: G10L15/00, G10L15/06, G10L15/20, G10L15/10, G10L17/04, G10L15/26, G06F17/27, G10L15/22

CPC classification number: G10L15/20, G06F17/2775, G10L15/10, G10L15/26, G10L15/265, G10L17/04, G10L25/51, G10L2015/223

Abstract: Systems and methods for generating sound profiles of artificial commands detected by multiple voice activated electronic devices are described herein. In some embodiments, numerous voice activated electronic devices may send audio data representing a phrase to a backend system at a substantially same time. Text data representing the phrase, and counts for instances of that text data, may be generated. If the number of counts exceeds a predefined threshold, the backend system may cause any remaining response generation functionality for that particular command that is in excess of the predefined threshold to be stopped, and those devices returned to a sleep state. In some embodiments, a sound profile unique to the phrase that caused the excess of the predefined threshold may be generated such that future instances of the same phrase may be recognized prior to text data being generated, conserving the backend system's resources.
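The counting step in this abstract can be sketched in a few lines. The example assumes the transcripts passed in were already collected within one short time window; the function name, the normalization (lowercasing and trimming), and the threshold value are illustrative assumptions rather than details from the patent.

```python
from collections import Counter
from typing import Iterable, Set


def phrases_over_threshold(transcripts: Iterable[str], threshold: int) -> Set[str]:
    """Return phrases whose count within the window exceeds the threshold."""
    counts = Counter(t.strip().lower() for t in transcripts)
    return {phrase for phrase, n in counts.items() if n > threshold}


# Transcripts assumed to have arrived from different devices at about the same time.
window = ["Play music"] * 4 + ["what time is it"]
flagged = phrases_over_threshold(window, threshold=3)
```

A backend following the abstract would then stop response generation for the flagged phrase and could build a sound profile for it, a step this sketch does not attempt.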

6.

Granted invention patent
Customized speech processing language models (in force)

Publication number: US09934777B1

Publication date: 2018-04-03

Application number: US15248211

Application date: 2016-08-26

Applicant: Amazon Technologies, Inc.

Inventor: Shaun Nidhiri Joseph, Sonal Pareek, Ariya Rastrow, Gautam Tiwari, Alexander David Rosen

IPC: G10L15/00, G10L15/06, G10L15/193, G10L15/02, G10L15/08, G10L15/22, G10L15/30, G10L15/18

CPC classification number: G10L15/063, G10L15/02, G10L15/08, G10L15/1815, G10L15/193, G10L15/22, G10L15/30, G10L2015/025, G10L2015/0635

Abstract: User-specific language models (LMs) are described that include internal word indices to a word table specific to the user-specific LM rather than a word table specific to a system-wide LM. When the system-wide LM is updated, the word table of the user-specific LM may be updated to translate the user-specific indices to system-wide indices. This prevents having to update the internal indices of the user-specific LM every time the system-wide LM is updated.
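The indirection described here can be shown with a small sketch. The data layout is an assumption for illustration: the user LM stores n-gram entries keyed by its own internal indices, and a separate translation table maps those indices to system-wide indices. The names `build_translation`, `user_words`, and `user_ngrams`, and the sample vocabularies, are hypothetical.

```python
from typing import Dict, List, Tuple


def build_translation(user_words: List[str], system_vocab: List[str]) -> List[int]:
    """Map each user-LM internal index to the system-wide index of the same word."""
    system_index = {word: i for i, word in enumerate(system_vocab)}
    return [system_index[w] for w in user_words]


user_words = ["call", "mom"]                            # internal indices 0 and 1
user_ngrams: Dict[Tuple[int, int], float] = {(0, 1): -0.7}  # keyed by internal indices

# Only the translation table is rebuilt when the system-wide vocabulary changes.
table_v1 = build_translation(user_words, ["play", "call", "mom"])
table_v2 = build_translation(user_words, ["call", "stop", "play", "mom"])
```

After the system-wide vocabulary changes, `user_ngrams` is untouched; only the small translation table differs between `table_v1` and `table_v2`.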

7.

Granted invention patent
Dynamic pruning in speech recognition (in force)

Publication number: US09613624B1

Publication date: 2017-04-04

Application number: US14314563

Application date: 2014-06-25

Applicant: Amazon Technologies, Inc.

Inventor: Jake Simon Kramer, Alexander David Rosen, Kenneth John Basye

IPC: G10L15/00, G10L15/22, G10L19/00, G10L15/04

CPC classification number: G10L15/08, G10L2015/085

Abstract: In a dynamic automatic speech recognition (ASR) processing system, ASR processing may be configured to estimate a latency of returning speech results to a user based on work being done by an ASR processor. The ASR processing system may measure work done by an ASR processor by measuring one or more time independent metrics and comparing the metrics to threshold values. If the metrics exceed the thresholds, the ASR system may take steps to reduce latency associated with processing the utterance, including adjusting a speech recognition parameter.
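A time-independent work metric of the kind this abstract mentions could be, for example, the number of active hypotheses in the decoder at a given frame. The sketch below assumes that metric and a multiplicative beam reduction; the function name, the threshold of 5000, and the shrink factor are hypothetical and not taken from the patent.

```python
def next_beam(active_hypotheses: int, beam: float,
              max_active: int = 5000, shrink: float = 0.9,
              min_beam: float = 4.0) -> float:
    """Tighten the pruning beam when the work metric exceeds its threshold."""
    if active_hypotheses > max_active:
        # More work than budgeted: prune harder, but never below the floor.
        return max(min_beam, beam * shrink)
    return beam


busy = next_beam(active_hypotheses=8000, beam=10.0)
idle = next_beam(active_hypotheses=100, beam=10.0)
```

Because the metric counts work rather than elapsed time, the same decision would be made regardless of how fast the underlying hardware happens to be.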
