Error reduction in speech processing

    Publication No.: US09697827B1

    Publication date: 2017-07-04

    Application No.: US13711478

    Filing date: 2012-12-11

    CPC classification number: G10L15/18 G10L15/14 G10L15/19

    Abstract: Features are disclosed for reducing errors in speech recognition processing. Methods for reducing errors can include receiving multiple speech recognition hypotheses based on an utterance indicative of a command or query of a user and determining a command or query within a grammar having a least amount of difference from one of the speech recognition hypotheses. The determination of the least amount of difference may be based at least in part on a comparison of individual subword units along at least some of the sequence paths of the speech recognition hypotheses and the grammar. For example, the comparison may be performed on the phoneme level instead of the word level.
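
The phoneme-level comparison described in this abstract can be illustrated with a small sketch. It is not the patented method: it assumes flat phoneme lists rather than the sequence paths of a recognition lattice, uses plain Levenshtein distance as the "amount of difference", and relies on a hypothetical toy lexicon (`LEXICON`) and helper names (`to_phonemes`, `edit_distance`, `closest_command`) invented for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical toy pronunciation lexicon (ARPAbet-like symbols).
# These entries are assumptions for illustration, not data from the patent.
LEXICON: Dict[str, List[str]] = {
    "play": ["P", "L", "EY"],
    "pay": ["P", "EY"],
    "music": ["M", "Y", "UW", "Z", "IH", "K"],
    "muse": ["M", "Y", "UW", "Z"],
    "sick": ["S", "IH", "K"],
    "stop": ["S", "T", "AA", "P"],
    "the": ["DH", "AH"],
}


def to_phonemes(text: str) -> List[str]:
    """Expand a space-separated word string into one flat phoneme list."""
    phonemes: List[str] = []
    for word in text.split():
        phonemes.extend(LEXICON[word])
    return phonemes


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def closest_command(
    hypotheses: Sequence[str], grammar: Sequence[str]
) -> Optional[Tuple[str, int]]:
    """Return the grammar entry with the smallest phoneme distance to any hypothesis."""
    best: Optional[Tuple[str, int]] = None
    for hyp in hypotheses:
        hyp_phonemes = to_phonemes(hyp)
        for command in grammar:
            distance = edit_distance(hyp_phonemes, to_phonemes(command))
            if best is None or distance < best[1]:
                best = (command, distance)
    return best


if __name__ == "__main__":
    hyps = ["pay muse sick", "play muse sick"]
    cmds = ["play music", "stop the music"]
    print(closest_command(hyps, cmds))
```

In this toy case the hypothesis "play muse sick" differs from the command "play music" by two words, but by only one phoneme, which is why comparing at the phoneme level can recover a command that a word-level comparison would score as a poor match.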

    SPEECH RECOGNIZER WITH MULTI-DIRECTIONAL DECODING

    Type: patent application; status: in force

    Publication No.: US20150095026A1

    Publication date: 2015-04-02

    Application No.: US14039383

    Filing date: 2013-09-27

    Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to process speech based on multiple channels of audio received from a beamformer. The ASR processing system may include a microphone array and the beamformer to output multiple channels of audio such that each channel isolates audio in a particular direction. The multichannel audio signals may include spoken utterances/speech from one or more speakers as well as undesired audio, such as noise from a household appliance. The ASR device may simultaneously perform speech recognition on the multi-channel audio to provide more accurate speech recognition results.
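
One simple reading of recognition over multiple beamformed channels is sketched below. It is an assumption-laden illustration, not the patented decoder: each channel is recognized independently and concurrently, and the highest-confidence result is kept, whereas the patent may combine channels inside a single decoding pass. The `toy_recognizer` stub and the `recognize_all_channels` helper are hypothetical stand-ins for a real ASR engine.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Hypothesis = Tuple[str, float]  # (transcript, confidence in [0, 1])


def toy_recognizer(samples: Sequence[float]) -> Hypothesis:
    """Hypothetical stand-in for an ASR engine: confidence tracks mean signal energy."""
    energy = sum(s * s for s in samples) / max(1, len(samples))
    text = "turn on the lights" if energy > 0.1 else ""
    return text, min(1.0, energy)


def recognize_all_channels(
    channels: Sequence[Sequence[float]],
    recognize: Callable[[Sequence[float]], Hypothesis],
) -> Tuple[int, Hypothesis]:
    """Run the recognizer on every channel concurrently and keep the best result."""
    if not channels:
        raise ValueError("at least one audio channel is required")
    with ThreadPoolExecutor(max_workers=len(channels)) as pool:
        results: List[Hypothesis] = list(pool.map(recognize, channels))
    # Pick the channel (beam direction) whose hypothesis has the highest confidence.
    best_index = max(range(len(results)), key=lambda i: results[i][1])
    return best_index, results[best_index]


if __name__ == "__main__":
    # Three made-up channels: two carry mostly noise, one carries the speaker.
    beams = [[0.01, 0.02], [0.5, -0.6, 0.7], [0.1, 0.1]]
    print(recognize_all_channels(beams, toy_recognizer))
```

Selecting after recognition, rather than picking one beam up front, means a channel that isolates appliance noise cannot be chosen merely because it is the loudest at the microphone array; it has to yield a confident transcript.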

    Architecture for multi-domain natural language processing

    Publication No.: US11176936B2

    Publication date: 2021-11-16

    Application No.: US16400905

    Filing date: 2019-05-01

    Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
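
The multi-domain flow in this abstract (interpret the transcription in several domains in parallel, then select a likely result, optionally guided by hints) might look roughly like the following sketch. Everything here is hypothetical: the keyword-based domain functions, the `Interpretation` type, and the use of hints as an additive score bonus are assumptions made for illustration, not details taken from the patent.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Interpretation:
    domain: str
    intent: str
    score: float


def music_domain(text: str) -> Interpretation:
    """Hypothetical single-domain NLU: keyword match stands in for a real model."""
    return Interpretation("music", "PlayMusic", 0.9 if "play" in text else 0.1)


def weather_domain(text: str) -> Interpretation:
    """Hypothetical single-domain NLU for weather queries."""
    return Interpretation("weather", "GetForecast", 0.9 if "weather" in text else 0.1)


DOMAINS: Dict[str, Callable[[str], Interpretation]] = {
    "music": music_domain,
    "weather": weather_domain,
}


def interpret(text: str, hints: Optional[Dict[str, float]] = None) -> Interpretation:
    """Process one transcription in every domain in parallel and pick the best result.

    `hints` maps a domain name to a score bonus, e.g. derived from previous
    interactions (this additive weighting is an assumption of the sketch).
    """
    bonus = hints or {}
    with ThreadPoolExecutor(max_workers=len(DOMAINS)) as pool:
        results: List[Interpretation] = list(
            pool.map(lambda domain_fn: domain_fn(text), DOMAINS.values())
        )
    return max(results, key=lambda r: r.score + bonus.get(r.domain, 0.0))


if __name__ == "__main__":
    print(interpret("play some jazz"))
    print(interpret("hello", hints={"weather": 0.5}))
```

Running every domain and ranking afterwards avoids committing to a single domain before the utterance has been interpreted; the hint bonus only breaks ties or nudges close scores in this sketch.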

    System for recognizing and responding to environmental noises

    Publication No.: US10424292B1

    Publication date: 2019-09-24

    Application No.: US13830222

    Filing date: 2013-03-14

    Abstract: An audio controlled assistant captures environmental noise and converts the environmental noise into audio signals. The audio signals are provided to a system which analyzes the audio signals for a plurality of audio prompts, which have been customized for the acoustic environment surrounding the audio controlled assistant by an acoustic modeling system. The system is configured to detect the presence of an audio prompt in the audio signals and, in response, transmit instructions associated with the detected audio prompt to at least one of the audio controlled assistant or one or more cloud based services.

    Architecture for multi-domain natural language processing

    Type: granted patent; status: in force

    Publication No.: US09436678B2

    Publication date: 2016-09-06

    Application No.: US14754598

    Filing date: 2015-06-29

    Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.

    Architecture for multi-domain natural language processing

    Publication No.: US10283119B2

    Publication date: 2019-05-07

    Application No.: US15966400

    Filing date: 2018-04-30

    Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.

    ARCHITECTURE FOR MULTI-DOMAIN NATURAL LANGUAGE PROCESSING

    Publication No.: US20220148590A1

    Publication date: 2022-05-12

    Application No.: US17454716

    Filing date: 2021-11-12

    Abstract: Features are disclosed for processing a user utterance with respect to multiple subject matters or domains, and for selecting a likely result from a particular domain with which to respond to the utterance or otherwise take action. A user utterance may be transcribed by an automatic speech recognition (“ASR”) module, and the results may be provided to a multi-domain natural language understanding (“NLU”) engine. The multi-domain NLU engine may process the transcription(s) in multiple individual domains rather than in a single domain. In some cases, the transcription(s) may be processed in multiple individual domains in parallel or substantially simultaneously. In addition, hints may be generated based on previous user interactions and other data. The ASR module, multi-domain NLU engine, and other components of a spoken language processing system may use the hints to more efficiently process input or more accurately generate output.
