System and method for generating user models from transcribed dialogs

    Publication No.: US09454966B2

    Publication date: 2016-09-27

    Application No.: US13926552

    Filing date: 2013-06-25

    CPC classification number: G10L15/265 G10L15/07 G10L2015/0631

    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for generating personalized user models. The method includes receiving automatic speech recognition (ASR) output of speech interactions with a user, receiving an ASR transcription error model characterizing how ASR transcription errors are made, generating guesses of a true transcription and a user model via an expectation maximization (EM) algorithm based on the error model and the respective ASR output where the guesses will converge to a personalized user model which maximizes the likelihood of the ASR output. The ASR output can be unlabeled. The method can include casting speech interactions as a dynamic Bayesian network with four variables: (s), (u), (r), (m), and encoding relationships between (s), (u), (r), (m) as conditional probability tables. At each dialog turn (r) and (m) are known and (s) and (u) are hidden.
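
The EM step described in this abstract can be illustrated with a deliberately reduced sketch. It is not the patented method: it drops the system state (s) and system action (m), keeps only a hidden user action (u) and an observed ASR result (r), and assumes the error model P(r | u) is given as a nested dictionary. The function name `em_user_model` and all data values are hypothetical.

```python
def em_user_model(observations, error_model, actions, iterations=100):
    """Estimate P(u) from unlabeled ASR output, given a known P(r | u).

    Simplifying assumption: one hidden user action u per observation r,
    with no dialog state or system action in the model.
    """
    # Initial guess: a uniform user model.
    user_model = {u: 1.0 / len(actions) for u in actions}
    for _ in range(iterations):
        expected_counts = {u: 0.0 for u in actions}
        for r in observations:
            # E-step: posterior over the true action behind this ASR result.
            joint = {u: user_model[u] * error_model[u].get(r, 0.0) for u in actions}
            total = sum(joint.values())
            if total == 0.0:
                continue  # observation impossible under the error model
            for u in actions:
                expected_counts[u] += joint[u] / total
        norm = sum(expected_counts.values())
        if norm == 0.0:
            break
        # M-step: re-estimate the user model from the expected counts.
        user_model = {u: expected_counts[u] / norm for u in actions}
    return user_model


# Hypothetical data: the recognizer confuses "yes" and "no" 20% of the time.
actions = ["yes", "no"]
error_model = {
    "yes": {"yes": 0.8, "no": 0.2},
    "no": {"yes": 0.2, "no": 0.8},
}
observations = ["yes"] * 7 + ["no"] * 3
model = em_user_model(observations, error_model, actions)
print(model)
```

With these numbers the observed rate of "yes" is 0.7, and the estimate for P(u = "yes") settles near 5/6, the value for which 0.8p + 0.2(1 - p) = 0.7. The estimate is higher than the raw observed rate because the error model explains some of the "no" results as misrecognitions.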

    System and Method for Efficient Tracking of Multiple Dialog States with Incremental Recombination
    4.
    Type: Patent application
    Legal status: In force

    Publication No.: US20130268274A1

    Publication date: 2013-10-10

    Application No.: US13909409

    Filing date: 2013-06-04

    Inventor: Jason Williams

    CPC classification number: G10L15/04 G10L15/14 G10L15/22

    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for tracking multiple dialog states. A system practicing the method receives an N-best list of speech recognition candidates, a list of current partitions, and a belief for each of the current partitions. A partition is a group of dialog states. In an outer loop, the system iterates over the N-best list of speech recognition candidates. In an inner loop, the system performs a split, update, and recombination process to generate a fixed number of partitions after each speech recognition candidate in the N-best list. The system recognizes speech based on the N-best list and the fixed number of partitions. The split process can perform all possible splits on all partitions. The update process can compute an estimated new belief. The estimated new belief can be a product of ASR reliability, user likelihood to produce this action, and an original belief.
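
The split, update, and recombination loop can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: a dialog state is a single string, a split divides belief uniformly across the states of a partition, the user likelihood is 1 when the hypothesis belongs to the partition and 0 otherwise, and recombination merges the two partitions with the lowest running estimate. The names `Partition`, `track_partitions`, and `default_user_likelihood` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    states: frozenset      # dialog states grouped in this partition
    belief: float          # original belief before this turn
    estimate: float = 0.0  # running, unnormalized estimate of the new belief


def default_user_likelihood(hypothesis, states):
    # Assumption: the user only says things consistent with their goal.
    return 1.0 if hypothesis in states else 0.0


def track_partitions(partitions, n_best, max_partitions,
                     user_likelihood=default_user_likelihood):
    parts = [Partition(p.states, p.belief) for p in partitions]
    for hypothesis, asr_reliability in n_best:  # outer loop over the N-best list
        # Split: isolate the hypothesized state from any partition containing it.
        split_parts = []
        for p in parts:
            if hypothesis in p.states and len(p.states) > 1:
                frac = 1.0 / len(p.states)  # uniform prior inside a partition
                split_parts.append(Partition(frozenset([hypothesis]),
                                             p.belief * frac, p.estimate * frac))
                split_parts.append(Partition(p.states - {hypothesis},
                                             p.belief * (1 - frac),
                                             p.estimate * (1 - frac)))
            else:
                split_parts.append(p)
        # Update: ASR reliability * user likelihood * original belief.
        for p in split_parts:
            p.estimate += asr_reliability * user_likelihood(hypothesis, p.states) * p.belief
        # Recombine: merge the weakest partitions until the count is bounded.
        while len(split_parts) > max_partitions:
            split_parts.sort(key=lambda p: p.estimate)
            a, b = split_parts[0], split_parts[1]
            split_parts[:2] = [Partition(a.states | b.states,
                                         a.belief + b.belief,
                                         a.estimate + b.estimate)]
        parts = split_parts
    total = sum(p.estimate for p in parts)
    if total == 0.0:
        return [Partition(p.states, p.belief) for p in parts]
    return [Partition(p.states, p.estimate / total) for p in parts]


# Hypothetical turn: four candidate cities, three recognition hypotheses.
start = [Partition(frozenset({"boston", "austin", "dallas", "denver"}), 1.0)]
n_best = [("boston", 0.6), ("austin", 0.3), ("dallas", 0.05)]
for p in track_partitions(start, n_best, max_partitions=3):
    print(sorted(p.states), round(p.belief, 3))
```

Recombining after every N-best entry, rather than once at the end, is what keeps the number of partitions bounded throughout the turn in this sketch.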

    System and method for efficient tracking of multiple dialog states with incremental recombination
    6.
    Type: Granted patent
    Legal status: In force

    Publication No.: US08700402B2

    Publication date: 2014-04-15

    Application No.: US13909409

    Filing date: 2013-06-04

    Inventor: Jason Williams

    CPC classification number: G10L15/04 G10L15/14 G10L15/22

    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for tracking multiple dialog states. A system practicing the method receives an N-best list of speech recognition candidates, a list of current partitions, and a belief for each of the current partitions. A partition is a group of dialog states. In an outer loop, the system iterates over the N-best list of speech recognition candidates. In an inner loop, the system performs a split, update, and recombination process to generate a fixed number of partitions after each speech recognition candidate in the N-best list. The system recognizes speech based on the N-best list and the fixed number of partitions. The split process can perform all possible splits on all partitions. The update process can compute an estimated new belief. The estimated new belief can be a product of ASR reliability, user likelihood to produce this action, and an original belief.

    System and Method for Generating User Models From Transcribed Dialogs
    7.
    Type: Patent application
    Legal status: In force

    Publication No.: US20130289985A1

    Publication date: 2013-10-31

    Application No.: US13926552

    Filing date: 2013-06-25

    CPC classification number: G10L15/265 G10L15/07 G10L2015/0631

    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for generating personalized user models. The method includes receiving automatic speech recognition (ASR) output of speech interactions with a user, receiving an ASR transcription error model characterizing how ASR transcription errors are made, generating guesses of a true transcription and a user model via an expectation maximization (EM) algorithm based on the error model and the respective ASR output where the guesses will converge to a personalized user model which maximizes the likelihood of the ASR output. The ASR output can be unlabeled. The method can include casting speech interactions as a dynamic Bayesian network with four variables: (s), (u), (r), (m), and encoding relationships between (s), (u), (r), (m) as conditional probability tables. At each dialog turn (r) and (m) are known and (s) and (u) are hidden.

    Incremental speech recognition for dialog systems
    9.
    Type: Granted patent
    Legal status: In force

    Publication No.: US09015048B2

    Publication date: 2015-04-21

    Application No.: US13691005

    Filing date: 2012-11-30

    CPC classification number: G10L15/1822

    Abstract: A system and method for integrating incremental speech recognition in dialog systems. An example system configured to practice the method receives incremental speech recognition results of user speech as part of a dialog with a user, and copies a dialog manager operating on the user speech to generate temporary instances of the dialog manager. Then the system evaluates actions the temporary instances of the dialog manager would take based on the incremental speech recognition results, and identifies an action that would advance the dialog and a corresponding temporary instance of the dialog manager. The system can then execute the action in the dialog and optionally replace the dialog manager with the corresponding temporary instance of the dialog manager. The action can include making a turn-taking decision in the dialog, such as whether, what, and when to speak or whether to be silent.
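
The copy-and-evaluate idea in this abstract can be sketched as below. The toy `DialogManager`, its single "city" slot, and the rule that any action other than "wait" advances the dialog are all assumptions made for illustration; the patent does not specify them.

```python
import copy


class DialogManager:
    """Toy slot-filling dialog manager (hypothetical)."""

    CITIES = ("boston", "austin")

    def __init__(self):
        self.slots = {}

    def step(self, partial_text):
        # Return the action this manager would take for the given input.
        for city in self.CITIES:
            if city in partial_text.lower():
                self.slots["city"] = city
                return f"confirm {city}"
        return "wait"


def process_incremental(manager, partial_results):
    """Try each partial ASR result on a temporary copy of the manager."""
    for partial in partial_results:
        candidate = copy.deepcopy(manager)  # temporary instance
        action = candidate.step(partial)
        if action != "wait":  # assumption: non-wait actions advance the dialog
            # Execute the action and adopt the temporary instance.
            return candidate, action
    return manager, "wait"  # nothing advanced the dialog; keep the original


dm = DialogManager()
partials = ["I want", "I want to fly to", "I want to fly to Boston"]
dm_after, action = process_incremental(dm, partials)
print(action, dm_after.slots, dm.slots)
```

Evaluating on deep copies means a partial result that is later revised never touches the live dialog manager; only a copy whose action advances the dialog replaces it.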
