Intent-specific automatic speech recognition result generation

    Publication number: US10811013B1

    Publication date: 2020-10-20

    Application number: US14137563

    Application date: 2013-12-20

    Abstract: Features are disclosed for generating intent-specific results in an automatic speech recognition system. The results can be generated by utilizing a decoding graph containing tags that identify portions of the graph corresponding to a given intent. The tags can also identify high-information content slots and low-information carrier phrases for a given intent. The automatic speech recognition system may utilize these tags to provide a semantic representation based on a plurality of different tokens for the content slot portions and low information for the carrier portions. A user can be presented with a user interface containing top intent results with corresponding intent-specific top content slot values.

    SPEECH MODEL RETRIEVAL IN DISTRIBUTED SPEECH RECOGNITION SYSTEMS

    Type: Invention application; Status: Under examination (published)

    Publication number: US20160071519A1

    Publication date: 2016-03-10

    Application number: US14942551

    Application date: 2015-11-16

    CPC classification number: G10L15/32 G10L15/22 G10L15/30

    Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that they may be used to update the models and data as they become available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
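The asynchronous retrieval and caching described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `ModelCache` class, its method names, and the use of plain strings as stand-ins for models are all assumptions made for illustration. The idea shown is that decoding can start with a general model while a more specific one is fetched in the background, and that a fetched model is reused once it has arrived.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class ModelCache:
    """Hypothetical cache that fetches models in the background.

    Models are represented as plain strings here; a real system would
    hold acoustic or language model objects instead.
    """

    def __init__(self, fetch: Callable[[str], str], executor: ThreadPoolExecutor):
        self._fetch = fetch
        self._executor = executor
        self._pending: Dict[str, Future] = {}

    def prefetch(self, key: str) -> Future:
        """Start an asynchronous fetch for `key`, or reuse one already started."""
        if key not in self._pending:
            self._pending[key] = self._executor.submit(self._fetch, key)
        return self._pending[key]

    def get_if_ready(self, key: str, default: str) -> str:
        """Return the fetched model if it has arrived, else the fallback model."""
        future = self._pending.get(key)
        if future is not None and future.done():
            return future.result()
        return default
```

A caller might invoke `prefetch` when a user is predicted to speak soon, decode the first pass with `get_if_ready(key, general_model)`, and re-process the utterance once the specific model is available. This sketch assumes `prefetch` is called from a single thread and does not handle fetch failures.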

    Incremental utterance processing and semantic stability determination

    Publication number: US10102851B1

    Publication date: 2018-10-16

    Application number: US14012262

    Application date: 2013-08-28

    Abstract: Incremental speech recognition results are generated and used to determine a user's intent from an utterance. Utterance audio data may be partitioned into multiple portions, and incremental speech recognition results may be generated from one or more of the portions. A natural language understanding module or some other language processing module can generate semantic representations of the utterance from the incremental speech recognition results. Stability of the determined intent may be determined over the course of time, and actions may be taken in response to meeting certain stability thresholds.
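One simple way to read the stability idea in this abstract is sketched below. It is an assumption-laden illustration, not the method claimed in the patent: stability is measured here as the number of consecutive incremental results that yield the same intent, and `first_stable_intent`, `toy_classifier`, and the threshold value are hypothetical names and choices.

```python
from typing import Callable, Iterable, Optional


def first_stable_intent(
    partial_transcripts: Iterable[str],
    classify_intent: Callable[[str], str],
    stability_threshold: int = 3,
) -> Optional[str]:
    """Return the first intent that stays unchanged for `stability_threshold`
    consecutive incremental results, or None if no intent stabilizes."""
    current: Optional[str] = None
    streak = 0
    for transcript in partial_transcripts:
        intent = classify_intent(transcript)
        if intent == current:
            streak += 1
        else:
            # The intent changed, so the stability count starts over.
            current, streak = intent, 1
        if streak >= stability_threshold:
            return current
    return None


def toy_classifier(text: str) -> str:
    """Hypothetical keyword-based stand-in for an NLU module."""
    if "play" in text:
        return "PlayMusic"
    if "weather" in text:
        return "GetWeather"
    return "Unknown"
```

With the partial transcripts `["pl", "play", "play some", "play some jazz"]`, the toy classifier yields one `Unknown` followed by three `PlayMusic` results, so the intent is reported as stable on the fourth increment, before the utterance necessarily ends.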

    Reducing speech recognition latency

    Type: Granted invention patent; Status: In force

    Publication number: US09514747B1

    Publication date: 2016-12-06

    Application number: US14011898

    Application date: 2013-08-28

    CPC classification number: G10L15/08 G10L25/60 G10L2015/085

    Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to reduce a latency of returning speech results to a user. The latency may be determined by comparing a time stamp of an utterance in process to a current time. Latency may also be estimated based on an endpoint of the utterance or other considerations such as how difficult the utterance may be to process. To improve latency, the ASR system may be configured to adjust various processing parameters, such as graph pruning factors, path weights, ASR models, etc. Latency checks and corrections may occur dynamically for a particular utterance while it is being processed, thus allowing the ASR system to adjust to rapidly changing latency conditions.
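The latency check described in this abstract can be sketched as follows. This is a hedged illustration only: `DecoderParams`, `adjust_beam`, the beam limits, and the step size are hypothetical, and a single beam width stands in for the "graph pruning factors" the abstract mentions. Widening the beam again when latency is under target is an added assumption, not something the abstract states.

```python
from dataclasses import dataclass


@dataclass
class DecoderParams:
    # A larger beam keeps more candidate paths alive, which costs more time.
    beam_width: float


def measure_latency(utterance_timestamp: float, now: float) -> float:
    """Latency is the gap between the audio being processed and the current time."""
    return now - utterance_timestamp


def adjust_beam(
    params: DecoderParams,
    latency: float,
    target_latency: float,
    min_beam: float = 4.0,
    max_beam: float = 16.0,
    step: float = 2.0,
) -> DecoderParams:
    """Tighten pruning when decoding falls behind, relax it otherwise."""
    if latency > target_latency:
        new_beam = max(min_beam, params.beam_width - step)
    else:
        new_beam = min(max_beam, params.beam_width + step)
    return DecoderParams(beam_width=new_beam)
```

Calling `adjust_beam` at intervals while an utterance is still being decoded mirrors the dynamic, per-utterance correction the abstract describes: a smaller beam trades some accuracy for speed until the decoder catches up.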

    Generation and use of multiple speech processing transforms

    Type: Granted invention patent; Status: In force

    Publication number: US09218806B1

    Publication date: 2015-12-22

    Application number: US13892167

    Application date: 2013-05-10

    CPC classification number: G10L15/02 G10L15/30 G10L15/32

    Abstract: Features are disclosed for selecting and using multiple transforms associated with a particular remote device for use in automatic speech recognition (“ASR”). Each transform may be based on statistics that have been generated from processing utterances that share some characteristic (e.g., acoustic characteristics, time frame within which the utterances were processed, etc.). When an utterance is received from the remote device, a particular transform or set of transforms may be selected for use in speech processing based on data obtained from the remote device, speech processing of a portion of the utterance, speech processing of prior utterances, etc. The transform or transforms used in processing the utterances may then be updated based on the results of the speech processing.

    Load balancing for automatic speech recognition

    Type: Granted invention patent; Status: In force

    Publication number: US09269355B1

    Publication date: 2016-02-23

    Application number: US13831286

    Application date: 2013-03-14

    CPC classification number: G10L15/30

    Abstract: Features are disclosed for transferring speech recognition workloads between pooled execution resources. For example, various parts of an automatic speech recognition engine may be implemented by various pools of servers. Servers in a speech recognition pool may explore a plurality of paths in a graph to find the path that best matches an utterance. A set of active nodes comprising the last node explored in each path may be transferred between servers in the pool depending on resource availability at each server. A history of nodes or arcs traversed in each path may be maintained by a separate pool of history servers, and used to generate text corresponding to the path identified as the best match by the speech recognition servers.

    Speech model retrieval in distributed speech recognition systems

    Type: Granted invention patent; Status: In force

    Publication number: US09190057B2

    Publication date: 2015-11-17

    Application number: US13712891

    Application date: 2012-12-12

    CPC classification number: G10L15/32 G10L15/22 G10L15/30

    Abstract: Features are disclosed for managing the use of speech recognition models and data in automated speech recognition systems. Models and data may be retrieved asynchronously and used as they are received or after an utterance is initially processed with more general or different models. Once received, the models and statistics can be cached. Statistics needed to update models and data may also be retrieved asynchronously so that they may be used to update the models and data as they become available. The updated models and data may be immediately used to re-process an utterance, or saved for use in processing subsequently received utterances. User interactions with the automated speech recognition system may be tracked in order to predict when a user is likely to utilize the system. Models and data may be pre-cached based on such predictions.
