Systems and methods for extracting meaning from multimodal inputs using finite-state devices
    1.
    Granted patent (legal status: in force)

    Publication No.: US07069215B1

    Publication date: 2006-06-27

    Application No.: US09904253

    Filing date: 2001-07-12

    IPC classification: G10L15/28

    Abstract: Finite-state systems and methods allow multiple input streams to be parsed and integrated by a single finite-state device. These systems and methods not only address multimodal recognition, but are also able to encode semantics and syntax into a single finite-state device. The finite-state device provides models for recognizing multimodal inputs, such as speech and gesture, and composes the meaning content from the various input streams into a single semantic representation. Compared to conventional multimodal recognition systems, finite-state systems and methods allow for compensation among the various input streams. Finite-state systems and methods allow one input stream to dynamically alter a recognition model used for another input stream, and can reduce the computational complexity of multidimensional multimodal parsing. Finite-state devices provide a well-understood probabilistic framework for combining the probability distributions associated with the various input streams and for selecting among competing multimodal interpretations.
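As an illustration of the idea in this abstract, the following is a minimal sketch, not the patented implementation, of a single finite-state device whose arcs carry three labels (gesture, word, meaning), so that one pass parses both input streams and composes a semantic representation. The grammar, the state numbers, the gesture symbols `Gp` and `Go`, and the function name `integrate` are all hypothetical and chosen only for this example.

```python
from collections import deque

EPS = None  # epsilon: the arc reads or emits nothing on that tape

# Hypothetical toy grammar. Each arc is
# (source state, gesture symbol, word, meaning template, target state).
ARCS = [
    (0, EPS, "email", "email([", 1),
    (1, EPS, "this", EPS, 2),
    (1, EPS, "that", EPS, 2),
    (2, "Gp", "person", "person({id})", 3),
    (2, "Go", "organization", "org({id})", 3),
    (3, EPS, "and", ",", 1),
    (3, EPS, EPS, "])", 4),
]
START, FINAL = 0, {4}


def integrate(words, gestures):
    """Return every meaning string the device assigns to the two streams.

    words    -- list of recognized words
    gestures -- list of (gesture symbol, selected object id) pairs
    """
    results = []
    # Breadth-first search over (state, word index, gesture index, output).
    queue = deque([(START, 0, 0, ())])
    while queue:
        state, wi, gi, out = queue.popleft()
        if state in FINAL and wi == len(words) and gi == len(gestures):
            results.append("".join(out))
        for src, g_label, w_label, meaning, dst in ARCS:
            if src != state:
                continue
            next_wi, next_gi, ident = wi, gi, ""
            if w_label is not EPS:  # arc must consume the next word
                if wi >= len(words) or words[wi] != w_label:
                    continue
                next_wi += 1
            if g_label is not EPS:  # arc must consume the next gesture
                if gi >= len(gestures) or gestures[gi][0] != g_label:
                    continue
                ident = gestures[gi][1]
                next_gi += 1
            emitted = () if meaning is EPS else (meaning.format(id=ident),)
            queue.append((dst, next_wi, next_gi, out + emitted))
    return results


speech = "email this person and that organization".split()
pointing = [("Gp", "e1"), ("Go", "e2")]
print(integrate(speech, pointing))
```

The toy grammar has no cycle made only of epsilon arcs, so the search terminates; a general implementation would also need weights on the arcs to rank competing interpretations.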

    SYSTEMS AND METHODS FOR EXTRACTING MEANING FROM MULTIMODAL INPUTS USING FINITE-STATE DEVICES
    4.
    Patent application (legal status: in force)

    Publication No.: US20120303370A1

    Publication date: 2012-11-29

    Application No.: US13485574

    Filing date: 2012-05-31

    IPC classification: G10L15/00

    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
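The compensation step in this abstract (gesture results shaping the language model used for speech) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: a flat list of scored gesture symbols stands in for the gesture recognition lattice, an n-best list stands in for the speech recognizer, and `GRAMMAR`, `language_model_from_gestures`, and `recognize_speech` are hypothetical names, not part of any real system.

```python
# Hypothetical joint grammar: each rule pairs a gesture symbol with a word
# sequence that may accompany it.
GRAMMAR = [
    ("Gp", ("email", "this", "person")),
    ("Go", ("email", "this", "organization")),
    ("Gp", ("call", "this", "person")),
]


def language_model_from_gestures(gesture_lattice):
    """Project the grammar onto its speech side, keeping only the rules
    whose gesture symbol occurs in the gesture lattice.

    gesture_lattice -- list of (gesture symbol, score) pairs
    Returns {word sequence: best gesture score supporting it}.
    """
    model = {}
    for gesture, score in gesture_lattice:
        for rule_gesture, words in GRAMMAR:
            if rule_gesture == gesture:
                model[words] = max(model.get(words, 0.0), score)
    return model


def recognize_speech(speech_nbest, model):
    """Pick the speech hypothesis with the best combined score, considering
    only hypotheses that the gesture-derived model allows."""
    scored = [
        (acoustic * model[tuple(words)], tuple(words))
        for words, acoustic in speech_nbest
        if tuple(words) in model
    ]
    return max(scored)[1] if scored else None


# The speech recognizer alone slightly prefers "organization", but the
# gesture recognizer is fairly sure the user pointed at a person.
gesture_lattice = [("Gp", 0.9), ("Go", 0.1)]
speech_nbest = [
    (("email", "this", "organization"), 0.5),
    (("email", "this", "person"), 0.4),
]
model = language_model_from_gestures(gesture_lattice)
print(recognize_speech(speech_nbest, model))
```

In this toy case the gesture evidence overrides the speech recognizer's first choice, which is the kind of cross-mode compensation the abstract describes.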

    Systems and Methods for Extracting Meaning from Multimodal Inputs Using Finite-State Devices
    5.
    Patent application (legal status: in force)

    Publication No.: US20120116768A1

    Publication date: 2012-05-10

    Application No.: US13291427

    Filing date: 2011-11-08

    IPC classification: G10L15/04

    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.

    Systems and methods for extracting meaning from multimodal inputs using finite-state devices
    6.
    Granted patent (legal status: in force)

    Publication No.: US06868383B1

    Publication date: 2005-03-15

    Application No.: US09904252

    Filing date: 2001-07-12

    IPC classification: G06K9/00 G10L15/08 G10L15/24

    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.

    Systems and methods for extracting meaning from multimodal inputs using finite-state devices
    7.
    Granted patent (legal status: in force)

    Publication No.: US08355916B2

    Publication date: 2013-01-15

    Application No.: US13485574

    Filing date: 2012-05-31

    IPC classification: G10L17/00

    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.

    Systems and methods for extracting meaning from multimodal inputs using finite-state devices

    Publication No.: US08214212B2

    Publication date: 2012-07-03

    Application No.: US13291427

    Filing date: 2011-11-08

    IPC classification: G10L17/00

    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.

    Systems and methods for extracting meaning from multimodal inputs using finite-state devices
    9.
    Granted patent (legal status: in force)

    Publication No.: US08103502B1

    Publication date: 2012-01-24

    Application No.: US11904085

    Filing date: 2007-09-26

    IPC classification: G10L17/00

    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.

    Systems and Methods for Generating Markup-Language Based Expressions from Multi-Modal and Unimodal Inputs
    10.
    Patent application (legal status: in force)

    Publication No.: US20100100509A1

    Publication date: 2010-04-22

    Application No.: US12644603

    Filing date: 2009-12-22

    IPC classification: G06F17/00

    Abstract: When using finite-state devices to perform various functions, it is beneficial to use finite-state devices representing regular grammars with terminals having markup-language-based semantics. By using markup-language-based symbols in the finite-state devices, it is possible to generate valid markup-language expressions by concatenating the symbols representing the result of the performed function. The markup-language expression can be used by other applications and/or devices. Finite-state devices are used to convert strings of words and gestures into valid markup-language expressions (for example, XML) that can be used, for example, to provide an application program interface to underlying system applications.
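The following is a minimal sketch, assuming a toy command grammar, of how a finite-state device whose output symbols are markup fragments can yield a well-formed XML expression simply by concatenating the symbols emitted along an accepting path. The transition table, the tag names, the gesture symbol `Garea`, and the function name `to_markup` are hypothetical; only `xml.etree.ElementTree` is a real library, used here to check well-formedness.

```python
import xml.etree.ElementTree as ET

# Hypothetical transducer: (state, input symbol) -> (output fragment, next state).
# Input symbols are words, plus "Garea" standing for an area-selection gesture.
TRANSITIONS = {
    (0, "show"): ("<cmd><action>show</action>", 1),
    (1, "restaurants"): ("<object>restaurant</object>", 2),
    (1, "theaters"): ("<object>theater</object>", 2),
    (2, "here"): ("", 3),
    (3, "Garea"): ("<location>area</location></cmd>", 4),
}
START, FINAL = 0, {4}


def to_markup(symbols):
    """Run the transducer over the input symbols and concatenate the
    emitted markup fragments. Raises ValueError if the input is rejected."""
    state, fragments = START, []
    for symbol in symbols:
        if (state, symbol) not in TRANSITIONS:
            raise ValueError(f"no transition from state {state} on {symbol!r}")
        fragment, state = TRANSITIONS[(state, symbol)]
        fragments.append(fragment)
    if state not in FINAL:
        raise ValueError("input ended before reaching a final state")
    return "".join(fragments)


xml_text = to_markup(["show", "restaurants", "here", "Garea"])
root = ET.fromstring(xml_text)  # parses only if the result is well-formed
print(xml_text)
print(root.find("object").text)
```

Because every accepting path opens and closes its tags in matching order, the concatenated output is well-formed by construction, and a downstream application can consume it with an ordinary XML parser.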