Dynamically generating a vocal help prompt in a multimodal application
    41.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US08086463B2

    Publication date: 2011-12-27

    Application No.: US11530930

    Filing date: 2006-09-12

    IPC classification: G10L21/00 G10L21/06

    Abstract: Dynamically generating a vocal help prompt in a multimodal application includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.

    Method of enhancing voice interactions using visual messages
    42.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US07966188B2

    Publication date: 2011-06-21

    Application No.: US10441839

    Filing date: 2003-05-20

    IPC classification: G10L11/00 G10L21/00

    CPC classification: G06F3/038 G06F3/167 G10L15/26

    Abstract: A method for enhancing voice interactions within a portable multimodal computing device using visual messages. A multimodal interface can be provided that includes an audio interface and a visual interface. A speech input can then be received and a voice recognition task can be performed upon at least a portion of the speech input. At least one message within the multimodal interface can be visually presented, wherein the message is a prompt for the speech input and/or a confirmation of the speech input.

    Enabling global grammars for a particular multimodal application
    43.
    Type: granted invention patent
    Legal status: in force

    Publication No.: US07809575B2

    Publication date: 2010-10-05

    Application No.: US11679279

    Filing date: 2007-02-27

    IPC classification: G10L21/00 G10L11/00 G10L15/18

    CPC classification: G10L15/19

    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page and determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
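
The load/keep/unload decision described in this abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the function name `enable_global_grammars`, the use of URL strings to identify pages, and the grammar names are all assumptions made for the example.

```python
def enable_global_grammars(page_url, page_grammars, app_pages, loaded):
    """Return the set of global grammars that are loaded after a page load.

    page_url      -- URL of the page just loaded (hypothetical identifier)
    page_grammars -- global grammars identified in that page
    app_pages     -- set of URLs belonging to the particular application
    loaded        -- set of global grammars loaded before this page
    """
    if page_url in app_pages:
        # Page belongs to the application: load any grammars not yet
        # loaded and keep every previously loaded one.
        return set(loaded) | set(page_grammars)
    # Page is outside the application: unload all global grammars.
    return set()


# Assumed example data: two pages of one application, one outside page.
app_pages = {"app/search", "app/results"}
state = enable_global_grammars("app/search", {"navigation"}, app_pages, set())
state = enable_global_grammars("app/results", {"help"}, app_pages, state)
after_leaving = enable_global_grammars("other/page", {"x"}, app_pages, state)
```

After the two in-application page loads, `state` holds both grammars; loading the outside page leaves `after_leaving` empty.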

    PARTIALLY FILLING MIXED-INITIATIVE FORMS FROM UTTERANCES HAVING SUB-THRESHOLD CONFIDENCE SCORES BASED UPON WORD-LEVEL CONFIDENCE DATA
    44.
    Type: invention patent application
    Legal status: in force

    Publication No.: US20080243502A1

    Publication date: 2008-10-02

    Application No.: US11692741

    Filing date: 2007-03-28

    IPC classification: G10L15/26

    CPC classification: G10L15/22 G10L15/193

    Abstract: The invention discloses prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance-level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below it. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played. Accordingly, data fields are partially filled in based upon the original speech utterance, and a second prompt for the unfilled fields is played.
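
The partial-fill logic in this abstract can be illustrated with a short Python sketch. Everything named here is an assumption for illustration: the `ElementResult` structure, the threshold values of 0.5, the sample field names, and the choice to accept every element when the utterance-level score clears its threshold (the abstract only describes the sub-threshold case).

```python
from dataclasses import dataclass


@dataclass
class ElementResult:
    """One recognized element of the utterance (hypothetical structure)."""
    name: str          # data field the element is mapped to
    value: str         # value derived by speech-to-text conversion
    confidence: float  # element-level confidence in [0, 1]


def partially_fill(results, utterance_confidence,
                   utterance_threshold=0.5, element_threshold=0.5):
    """Return (filled_fields, fields_to_reprompt)."""
    if utterance_confidence >= utterance_threshold:
        # Assumption: a confident utterance fills every field directly.
        return {r.name: r.value for r in results}, []
    filled, reprompt = {}, []
    for r in results:
        if r.confidence > element_threshold:
            filled[r.name] = r.value   # first set: store the value
        else:
            reprompt.append(r.name)    # second set: prompt again
    return filled, reprompt


results = [
    ElementResult("origin", "Boston", 0.91),
    ElementResult("destination", "Austin", 0.32),
    ElementResult("date", "Friday", 0.78),
]
filled, reprompt = partially_fill(results, utterance_confidence=0.41)
```

With these sample scores, `origin` and `date` are stored and only `destination` is queued for a second prompt.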

    Altering Behavior Of A Multimodal Application Based On Location
    45.
    Type: invention patent application
    Legal status: in force

    Publication No.: US20080208593A1

    Publication date: 2008-08-28

    Application No.: US11679301

    Filing date: 2007-02-27

    IPC classification: G10L21/00

    CPC classification: G10L15/22 G10L15/24

    Abstract: Methods, apparatus, and products are disclosed for altering behavior of a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application, including a voice mode and one or more non-voice modes. The voice mode of user interaction with the multimodal application is supported by a voice interpreter. Altering behavior of a multimodal application based on location includes: receiving a location change notification in the voice interpreter from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device; updating, by the voice interpreter, location-based environment parameters for the voice interpreter in dependence upon the current location of the multimodal device; and interpreting, by the voice interpreter, the multimodal application in dependence upon the location-based environment parameters.

    Pausing A VoiceXML Dialog Of A Multimodal Application
    46.
    Type: invention patent application
    Legal status: in force

    Publication No.: US20080208584A1

    Publication date: 2008-08-28

    Application No.: US11679236

    Filing date: 2007-02-27

    IPC classification: G10L13/00 G10L11/00

    Abstract: Pausing a VoiceXML dialog of a multimodal application, including generating by the multimodal application a pause event; responsive to the pause event, temporarily pausing the dialog by the VoiceXML interpreter; generating by the multimodal application a resume event; and responsive to the resume event, resuming the dialog. Embodiments are implemented with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the VoiceXML interpreter is interpreting the VoiceXML dialog to be paused.
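
The pause and resume sequence in this abstract amounts to a small state machine. The Python sketch below is only a toy model under stated assumptions: `DialogInterpreter` is a hypothetical stand-in for a VoiceXML interpreter, the event names `"pause"` and `"resume"` are invented for the example, and the dialog is reduced to a list of prompts.

```python
class DialogInterpreter:
    """Hypothetical stand-in for an interpreter running one dialog."""

    def __init__(self, prompts):
        self.prompts = list(prompts)
        self.position = 0     # index of the next prompt to play
        self.paused = False

    def handle_event(self, event):
        # Events are assumed to be raised by the multimodal application.
        if event == "pause":
            self.paused = True
        elif event == "resume":
            self.paused = False

    def step(self):
        """Play the next prompt, or return None while paused or finished."""
        if self.paused or self.position >= len(self.prompts):
            return None
        prompt = self.prompts[self.position]
        self.position += 1
        return prompt


interp = DialogInterpreter(["Where from?", "Where to?"])
first = interp.step()           # dialog runs normally
interp.handle_event("pause")
while_paused = interp.step()    # nothing is played while paused
interp.handle_event("resume")
second = interp.step()          # dialog continues where it stopped
```

The dialog position is kept across the pause, so resuming continues with the second prompt rather than restarting the dialog.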

    Dynamically Generating a Vocal Help Prompt in a Multimodal Application
    47.
    Type: invention patent application
    Legal status: in force

    Publication No.: US20080065390A1

    Publication date: 2008-03-13

    Application No.: US11530930

    Filing date: 2006-09-12

    IPC classification: G10L21/00

    Abstract: Dynamically generating a vocal help prompt in a multimodal application includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.

    Dynamic help including available speech commands from content contained within speech grammars
    48.
    Type: invention patent application
    Legal status: in force

    Publication No.: US20070213984A1

    Publication date: 2007-09-13

    Application No.: US11375417

    Filing date: 2006-03-13

    IPC classification: G10L15/18

    CPC classification: G06F3/167 G10L2015/228

    Abstract: A method for providing help to voice-enabled applications, including multimodal applications, can include a step of identifying at least one speech grammar associated with a voice-enabled application. Help fields can be defined within the speech grammar. The help fields can include available speech commands for the voice-enabled application. When the speech grammar is activated for use by the voice-enabled application, the available speech commands can be presented to a user of the voice-enabled application. The presented speech commands can be obtained from the help fields.

    Method and system of building a grammar rule with baseforms generated dynamically from user utterances
    49.
    Type: invention patent application
    Legal status: in force

    Publication No.: US20060047510A1

    Publication date: 2006-03-02

    Application No.: US10924520

    Filing date: 2004-08-24

    IPC classification: G10L15/26

    CPC classification: G10L15/187 G10L2015/0631

    Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
