INCREMENTAL SPEECH RECOGNITION FOR DIALOG SYSTEMS
    1.
    Invention application
    Status: Granted

    Publication No.: US20140156268A1

    Publication date: 2014-06-05

    Application No.: US13691005

    Filing date: 2012-11-30

    CPC classification number: G10L15/1822

    Abstract: A system and method for integrating incremental speech recognition in dialog systems. An example system configured to practice the method receives incremental speech recognition results of user speech as part of a dialog with a user, and copies a dialog manager operating on the user speech to generate temporary instances of the dialog manager. Then the system evaluates actions the temporary instances of the dialog manager would take based on the incremental speech recognition results, and identifies an action that would advance the dialog and a corresponding temporary instance of the dialog manager. The system can then execute the action in the dialog and optionally replace the dialog manager with the corresponding temporary instance of the dialog manager. The action can include making a turn-taking decision in the dialog, such as whether, what, and when to speak or whether to be silent.
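The copy-and-evaluate loop in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `DialogManager` class, its slot-filling rule, and the action names `confirm_city` and `stay_silent` are all hypothetical, chosen only to show temporary instances being tried against incremental results and adopted when they advance the dialog.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DialogManager:
    """Toy dialog manager (hypothetical): fills a slot, then decides to speak."""
    slots: dict = field(default_factory=dict)

    def update(self, partial_text: str) -> str:
        # Assumed rule: recognise a known city anywhere in the partial result.
        for city in ("boston", "austin"):
            if city in partial_text.lower():
                self.slots["city"] = city
        # Turn-taking decision: speak only once the slot has been filled.
        return "confirm_city" if "city" in self.slots else "stay_silent"


def advance_dialog(manager, partial_results):
    """Evaluate each incremental result on a temporary copy of the manager.

    Returns (action, manager). The live manager is replaced only when a
    temporary instance yields an action that advances the dialog.
    """
    for partial in partial_results:
        trial = copy.deepcopy(manager)   # temporary instance
        action = trial.update(partial)
        if action != "stay_silent":      # this action advances the dialog
            return action, trial         # adopt the temporary instance
    return "stay_silent", manager        # nothing advanced; keep the original


live = DialogManager()
partials = ["i want", "i want to fly to", "i want to fly to boston"]
action, live = advance_dialog(live, partials)
print(action, live.slots)
```

Because every trial runs on a deep copy, a partial result that later turns out to be wrong never alters the live manager's state.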

    SYSTEM AND METHOD FOR ADVANCED TURN-TAKING INTERACTIVE SPOKEN DIALOG SYSTEMS
    2.
    Invention application
    Status: Under examination (published)

    Publication No.: US20160300572A1

    Publication date: 2016-10-13

    Application No.: US15190325

    Filing date: 2016-06-23

    CPC classification number: G10L15/222 G10L15/04 G10L15/05 G10L15/063 G10L15/083

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system. A system configured according to this disclosure can incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions. A first condition which, if found, allows the system to communicate partial speech recognition results, is that the most recent word found in the partial results is statistically likely to be the termination of the utterance, also known as a terminal node. A second condition is the determination that all search paths within a speech lattice converge to a common node, also known as a pinch node, before branching out again. Upon finding either condition, the system can communicate the partial speech recognition results. Stability and correctness probabilities can also determine which partial results are communicated.
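The two release conditions can be sketched in Python under stated assumptions. The lattice is modelled as a plain adjacency dict, a pinch node is found by checking that blocking it cuts every path from the start node to a final node, and the end-of-utterance probability `p_end` and its `threshold` of 0.8 are illustrative values, not figures from the patent.

```python
def reachable(lattice, start, blocked=None):
    """Return the nodes reachable from start without entering `blocked`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == blocked or node in seen:
            continue
        seen.add(node)
        stack.extend(lattice.get(node, []))
    return seen


def pinch_nodes(lattice, start, finals):
    """Interior nodes that every start-to-final path must pass through."""
    finals = set(finals)
    candidates = reachable(lattice, start) - {start} - finals
    return sorted(
        n for n in candidates
        if not (reachable(lattice, start, blocked=n) & finals)
    )


def should_emit(lattice, start, finals, p_end, threshold=0.8):
    """Emit partial results on a likely terminal word or on a pinch node."""
    return p_end >= threshold or bool(pinch_nodes(lattice, start, finals))


# Paths branch at node 0, converge at node 3, then branch out again.
lattice = {0: [1, 2], 1: [3], 2: [3], 3: [4, 5], 4: [6], 5: [6]}
print(pinch_nodes(lattice, 0, [6]))
print(should_emit(lattice, 0, [6], p_end=0.1))
```

The blocking test is quadratic in lattice size, which is acceptable for a sketch; a production decoder would more likely track convergence during search.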

    SYSTEM AND METHOD FOR ADVANCED TURN-TAKING FOR INTERACTIVE SPOKEN DIALOG SYSTEMS
    3.
    Invention application
    Status: Granted

    Publication No.: US20150100316A1

    Publication date: 2015-04-09

    Application No.: US14565516

    Filing date: 2014-12-10

    CPC classification number: G10L15/222 G10L15/04 G10L15/05 G10L15/063 G10L15/083

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system. A system configured according to this disclosure can incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions. A first condition which, if found, allows the system to communicate partial speech recognition results, is that the most recent word found in the partial results is statistically likely to be the termination of the utterance, also known as a terminal node. A second condition is the determination that all search paths within a speech lattice converge to a common node, also known as a pinch node, before branching out again. Upon finding either condition, the system can communicate the partial speech recognition results. Stability and correctness probabilities can also determine which partial results are communicated.

    SYSTEM AND METHOD FOR MULTI-AGENT ARCHITECTURE FOR INTERACTIVE MACHINES
    5.
    Invention application
    Status: Granted

    Publication No.: US20160063992A1

    Publication date: 2016-03-03

    Application No.: US14473288

    Filing date: 2014-08-29

    Inventor: Ethan SELFRIDGE

    CPC classification number: G10L15/22 G10L15/222 G10L2015/227

    Abstract: Systems, methods, and computer-readable storage devices are disclosed for an event-driven multi-agent architecture that improves via a semi-hierarchical multi-agent reinforcement learning approach. A system receives a user input during a speech dialog between a user and the system. The system then processes the user input, identifying an importance of the user input to the speech dialog based on a user classification and identifying a variable strength turn-taking signal inferred from the user input. An utterance selection agent selects an utterance for replying to the user input based on the importance of the user input, and a turn-taking agent determines whether to output the utterance based on the utterance and the variable strength turn-taking signal. When the turn-taking agent indicates the utterance should be output, the system selects when to output the utterance.

    SYSTEM AND METHOD FOR CREATING AND SHARING PLANS THROUGH MULTIMODAL DIALOG
    6.
    Invention application
    Status: Granted

    Publication No.: US20160179908A1

    Publication date: 2016-06-23

    Application No.: US14577311

    Filing date: 2014-12-19

    CPC classification number: G06F3/04847 G06F17/3087

    Abstract: Methods, systems, devices, and media for creating a plan through multimodal search inputs are provided. A multimodal virtual assistant receives a first search request which comprises a geographic area. First search results are displayed in response to the first search request being received. The first search results are based on the first search request and correspond to the geographic area. Each of the first search results is associated with a geographic location. The multimodal virtual assistant receives a selection of one of the first search results, and adds the selected one of the first search results to a plan. A second search request is received after the selection, and second search results are displayed in response to the second search request being received. The second search results are based on the second search request and correspond to the geographic location of the selected one of the first search results.

    SYSTEM AND METHOD FOR LOCALIZED ERROR DETECTION OF RECOGNITION RESULTS
    7.
    Invention application
    Status: Granted

    Publication No.: US20160155445A1

    Publication date: 2016-06-02

    Application No.: US14557030

    Filing date: 2014-12-01

    CPC classification number: G10L15/22 G10L15/01 G10L15/1822 H04M2250/74

    Abstract: A system, method, and computer-readable storage devices are disclosed for using targeted clarification (TC) questions in dialog systems in a multimodal virtual agent system (MVA) providing access to information about movies, restaurants, and musical events. In contrast with open-domain spoken systems, the MVA application covers a domain with a fixed set of concepts and uses a natural language understanding (NLU) component to mark concepts in automatically recognized speech. Instead of identifying an error segment, localized error detection (LED) identifies which of the concepts are likely to be present and correct using domain knowledge, automatic speech recognition (ASR), and NLU tags and scores. If at least one concept is identified as present but not correct, the TC component uses this information to generate a targeted clarification question. This approach computes probability distributions of concept presence and correctness for each user utterance, which can apply to automatic learning for clarification policies.
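The decision step described here, asking a targeted question when a concept is probably present but probably misrecognized, might look like the following Python sketch. The `ConceptEstimate` structure, the 0.5 thresholds, and the question template are assumptions for illustration; the abstract does not specify how the probabilities are computed or how questions are worded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConceptEstimate:
    """Per-concept scores for one utterance (hypothetical structure)."""
    name: str          # concept type, e.g. "cuisine" or "location"
    value: str         # value recognized by ASR and NLU
    p_present: float   # estimated probability the concept was spoken
    p_correct: float   # estimated probability the value is right


def clarification_question(estimates, present_min=0.5,
                           correct_min=0.5) -> Optional[str]:
    """Return a targeted question for the first present-but-doubtful concept."""
    for concept in estimates:
        if concept.p_present >= present_min and concept.p_correct < correct_min:
            return f"Which {concept.name} did you mean?"
    return None  # every present concept looks correct; no clarification needed


estimates = [
    ConceptEstimate("cuisine", "thai", p_present=0.90, p_correct=0.80),
    ConceptEstimate("location", "austin", p_present=0.85, p_correct=0.30),
]
print(clarification_question(estimates))
```

Fixed thresholds are the simplest possible policy; the abstract notes that the same probabilities could instead feed automatically learned clarification policies.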
