-
Publication number: US20160379632A1
Publication date: 2016-12-29
Application number: US14753811
Application date: 2015-06-29
Applicant: Amazon Technologies, Inc.
Inventor: Bjorn Hoffmeister, Ariya Rastrow, Baiyang Liu
CPC classification number: G10L15/22, G10L15/18, G10L15/183, G10L15/26, G10L25/87, G10L25/93, G10L2025/783
Abstract: An automatic speech recognition (ASR) system detects an endpoint of an utterance using the active hypotheses under consideration by a decoder. The ASR system calculates the amount of non-speech detected by each of a plurality of hypotheses and weights the non-speech duration by the probability of each hypothesis. When the aggregate weighted non-speech exceeds a threshold, an endpoint may be declared.
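Illustration: the weighting described in the abstract can be sketched as a probability-weighted average of the trailing non-speech observed in each active hypothesis. This is a minimal sketch, not the patented implementation. The names `Hypothesis`, `expected_nonspeech_frames`, and `endpoint_detected` are hypothetical, as are the choices to count non-speech in frames and to normalize the hypothesis probabilities before weighting.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Hypothesis:
    """One active decoder hypothesis (hypothetical structure for illustration)."""
    prob: float                     # probability or unnormalized score, >= 0
    trailing_nonspeech_frames: int  # non-speech frames at the end of this hypothesis


def expected_nonspeech_frames(hyps: Sequence[Hypothesis]) -> float:
    """Probability-weighted non-speech duration across all hypotheses.

    Assumption: probabilities are renormalized over the active set so the
    result is an expected duration in frames.
    """
    total = sum(h.prob for h in hyps)
    if total <= 0:
        return 0.0
    return sum(h.prob * h.trailing_nonspeech_frames for h in hyps) / total


def endpoint_detected(hyps: Sequence[Hypothesis], threshold_frames: float) -> bool:
    """Declare an endpoint when the aggregate weighted non-speech exceeds the threshold."""
    return expected_nonspeech_frames(hyps) > threshold_frames
```

With this formulation, a likely hypothesis that still expects more speech pulls the aggregate down, so a long pause in an unlikely hypothesis alone does not trigger an endpoint.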
-
Publication number: US09437186B1
Publication date: 2016-09-06
Application number: US13921671
Application date: 2013-06-19
Applicant: Amazon Technologies, Inc.
Inventor: Baiyang Liu, Hugh Evan Secker-Walker, Alexander David Rosen
CPC classification number: G10L15/05, G10L15/00, G10L15/1815, G10L15/19, G10L15/22, G10L25/78, G10L2015/223
Abstract: Determining the end of an utterance for purposes of automatic speech recognition (ASR) may be improved with a system that provides early results and/or incorporates semantic tagging. Early ASR results of an incoming utterance may be prepared based at least in part on an estimated endpoint and processed by a natural language understanding (NLU) process while final results, based at least in part on a final endpoint, are determined. If the early results match the final results, the early NLU results are already prepared for early execution. The endpoint may also be determined based at least in part on the content of the utterance, as represented by semantic tagging output from ASR processing. If the tagging indicates completion of a logical statement, an endpoint may be declared, or a threshold for silent frames prior to declaring an endpoint may be adjusted.
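Illustration: the two ideas in the abstract, adjusting the silent-frame threshold when semantic tagging indicates a complete statement and reusing early NLU results when early and final ASR results match, can be sketched as follows. This is a rough sketch under stated assumptions, not the patented method. All function names are hypothetical, and the `reduction` factor of 0.5 is an arbitrary placeholder rather than a value from the source.

```python
from typing import Callable, Dict


def silence_threshold(base_frames: int, statement_complete: bool,
                      reduction: float = 0.5) -> int:
    """Silent frames required before declaring an endpoint.

    Assumption: a complete logical statement shortens the wait by a fixed
    factor; the factor itself is illustrative only.
    """
    return int(base_frames * reduction) if statement_complete else base_frames


def should_endpoint(silent_frames: int, base_frames: int,
                    statement_complete: bool) -> bool:
    """Declare an endpoint once enough silent frames have accumulated."""
    return silent_frames >= silence_threshold(base_frames, statement_complete)


def select_nlu_result(early_asr: str, final_asr: str, early_nlu: Dict,
                      run_nlu: Callable[[str], Dict]) -> Dict:
    """Reuse the early NLU result if the early ASR text matches the final text.

    Otherwise fall back to running NLU on the final ASR result.
    """
    if early_asr == final_asr:
        return early_nlu
    return run_nlu(final_asr)
```

The benefit of the second function is latency: when the estimated endpoint turns out to be correct, the NLU work has already been done by the time the final endpoint is confirmed.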
-