AUTOMATIC SPEECH RECOGNITION WITH A SELECTION LIST
    21.
    Patent application
    AUTOMATIC SPEECH RECOGNITION WITH A SELECTION LIST (status: in force)

    Publication No.: US20080162136A1

    Publication date: 2008-07-03

    Application No.: US11619209

    Filing date: 2007-01-03

    IPC classification: G10L15/18 G10L15/04 G10L21/00

    Abstract: Methods, apparatus, and computer program products are described for automatic speech recognition (‘ASR’) that include accepting by the multimodal application speech input and visual input for selecting or deselecting items in a selection list, the speech input enabled by a speech recognition grammar; providing, from the multimodal application to the grammar interpreter, the speech input and the speech recognition grammar; receiving, by the multimodal application from the grammar interpreter, interpretation results including matched words from the grammar that correspond to items in the selection list and a semantic interpretation token that specifies whether to select or deselect items in the selection list; and determining, by the multimodal application in dependence upon the value of the semantic interpretation token, whether to select or deselect items in the selection list that correspond to the matched words.
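
As a rough illustration of the final step described in this abstract, the sketch below applies an interpretation result to a selection list. The names `InterpretationResult` and `apply_interpretation`, and the token values `"select"` and `"deselect"`, are assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class InterpretationResult:
    """Hypothetical shape of what a grammar interpreter might return."""
    matched_words: list[str]  # words from the grammar that were recognized
    token: str                # assumed values: "select" or "deselect"


def apply_interpretation(items, selected, result):
    """Return the new set of selected items after applying the result."""
    # Only matched words that correspond to items in the list are considered.
    matched = {word for word in result.matched_words if word in items}
    if result.token == "select":
        return set(selected) | matched
    if result.token == "deselect":
        return set(selected) - matched
    raise ValueError(f"unknown token: {result.token!r}")
```

Returning a new set rather than mutating the current selection keeps the sketch easy to combine with visual input, which could update the same selection through a separate path.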

    Method and system for voice-enabled autofill

    Publication No.: US20060074652A1

    Publication date: 2006-04-06

    Application No.: US11199672

    Filing date: 2005-08-09

    IPC classification: G10L15/00

    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
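
The two parts named in this abstract, grammar generation and the auto-fill event, can be sketched as follows. This is a toy model under stated assumptions: the grammar is reduced to a dictionary from a spoken phrase to an interpretation string, the phrase form "my <field>" is invented, and `build_grammar` and `autofill_event` are hypothetical names.

```python
def build_grammar(field_name, profile):
    """Build a toy grammar: a spoken phrase mapped to an interpretation string."""
    # Assumed phrase form: "my <field>"; the interpretation is the profile value.
    return {f"my {field_name}": profile[field_name]}


def autofill_event(form, field_name, grammar, utterance):
    """Fill the form field if the utterance matches the grammar."""
    value = grammar.get(utterance.strip().lower())
    if value is None:
        return False          # no match, so no auto-fill event is raised
    form[field_name] = value  # effect of the auto-fill event
    return True
```

For example, with a profile of `{"email": "pat@example.com"}`, the utterance "my email" would fill the `email` field with the profile value, while an unmatched utterance leaves the form unchanged.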

    Method and apparatus for voice-enabling an application
    24.
    Patent application
    Method and apparatus for voice-enabling an application (status: in force)

    Publication No.: US20050283367A1

    Publication date: 2005-12-22

    Application No.: US10870517

    Filing date: 2004-06-17

    IPC classification: G10L21/00

    CPC classification: G10L2015/228

    Abstract: A method of voice-enabling an application for command and control and content navigation can include the application dynamically generating a markup language fragment specifying a command and control and content navigation grammar for the application, instantiating an interpreter from a voice library, and providing the markup language fragment to the interpreter. The method also can include the interpreter processing a speech input using the command and control and content navigation grammar specified by the markup language fragment and providing an event to the application indicating an instruction representative of the speech input.

    Pausing a VoiceXML dialog of a multimodal application
    25.
    Granted patent
    Pausing a VoiceXML dialog of a multimodal application (status: in force)

    Publication No.: US08713542B2

    Publication date: 2014-04-29

    Application No.: US11679236

    Filing date: 2007-02-27

    Abstract: Pausing a VoiceXML dialog of a multimodal application, including generating by the multimodal application a pause event; responsive to the pause event, temporarily pausing the dialogue by the VoiceXML interpreter; generating by the multimodal application a resume event; and responsive to the resume event, resuming the dialog. Embodiments are implemented with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the VoiceXML interpreter is interpreting the VoiceXML dialog to be paused.
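
The pause and resume behavior in this abstract resembles a small state machine. The sketch below is not a VoiceXML interpreter; `DialogRunner` is a hypothetical stand-in that steps through a list of dialog steps, and the event names `"pause"` and `"resume"` are assumed.

```python
class DialogRunner:
    """Hypothetical stand-in for an interpreter stepping through a dialog."""

    def __init__(self, steps):
        self.steps = list(steps)
        self.position = 0
        self.paused = False

    def handle_event(self, event):
        # The application is assumed to raise "pause" and "resume" events.
        if event == "pause":
            self.paused = True
        elif event == "resume":
            self.paused = False

    def advance(self):
        """Run the next step unless paused or finished; return it or None."""
        if self.paused or self.position >= len(self.steps):
            return None
        step = self.steps[self.position]
        self.position += 1
        return step
```

Because the position is kept while paused, a resume event continues the dialog from where it stopped rather than restarting it.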

    Dynamic help including available speech commands from content contained within speech grammars
    26.
    Granted patent
    Dynamic help including available speech commands from content contained within speech grammars (status: in force)

    Publication No.: US08311836B2

    Publication date: 2012-11-13

    Application No.: US11375417

    Filing date: 2006-03-13

    IPC classification: G10L21/00 G10L15/04

    CPC classification: G06F3/167 G10L2015/228

    Abstract: A method for providing help to voice-enabled applications, including multimodal applications, can include a step of identifying at least one speech grammar associated with a voice-enabled application. Help fields can be defined within the speech grammar. The help fields can include available speech commands for the voice enabled application. When the speech grammar is activated for use by the voice-enabled application, the available speech commands can be presented to a user of the voice-enabled application. The presented speech commands can be obtained from the help fields.
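
One way to picture help fields stored inside grammars is shown below. The data layout (a dictionary per grammar with a `"help"` list) and the function name `available_commands` are assumptions for illustration only; the abstract does not describe a concrete format.

```python
def available_commands(grammars, active_names):
    """Collect help-field commands from the currently active grammars."""
    commands = []
    for name in active_names:
        for command in grammars[name].get("help", []):
            if command not in commands:  # keep first occurrence, preserve order
                commands.append(command)
    return commands
```

Only grammars listed as active contribute commands, so the help shown to the user changes as grammars are activated or deactivated.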

    DYNAMICALLY GENERATING A VOCAL HELP PROMPT IN A MULTIMODAL APPLICATION
    27.
    Patent application
    DYNAMICALLY GENERATING A VOCAL HELP PROMPT IN A MULTIMODAL APPLICATION (status: under examination, published)

    Publication No.: US20120065982A1

    Publication date: 2012-03-15

    Application No.: US13303380

    Filing date: 2011-11-23

    IPC classification: G10L21/00

    Abstract: Dynamically generating a vocal help prompt in a multimodal application that include detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.

    Enabling global grammars for a particular multimodal application
    28.
    Granted patent
    Enabling global grammars for a particular multimodal application (status: in force)

    Publication No.: US08073698B2

    Publication date: 2011-12-06

    Application No.: US12873149

    Filing date: 2010-08-31

    IPC classification: G10L21/00 G10L11/00 G10L15/18

    CPC classification: G10L15/19

    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
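
The two branches in this abstract can be written as a single decision. In the sketch below, grammars and pages are represented as plain strings, and `enable_global_grammars` with its parameters is a hypothetical name chosen for illustration.

```python
def enable_global_grammars(loaded, page_url, app_pages, page_grammars):
    """Return the global grammars that should be loaded after a page load.

    loaded:        names of global grammars currently loaded
    page_url:      URL of the page that was just loaded
    app_pages:     URLs of the pages that make up the application
    page_grammars: global grammars identified in the loaded page
    """
    if page_url in app_pages:
        # Keep what is already loaded and add any grammar not yet loaded.
        return set(loaded) | set(page_grammars)
    # The page is outside the application: unload everything.
    return set()
```

The set union covers both "load any currently unloaded" and "maintain any previously loaded" in one step, since grammars already present are left untouched.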

    Method and system for voice-enabled autofill
    29.
    Granted patent
    Method and system for voice-enabled autofill (status: in force)

    Publication No.: US07953597B2

    Publication date: 2011-05-31

    Application No.: US11199672

    Filing date: 2005-08-09

    IPC classification: G10L15/26 G06F17/00 G10L15/00

    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.

    Ordering recognition results produced by an automatic speech recognition engine for a multimodal application
    30.
    Granted patent
    Ordering recognition results produced by an automatic speech recognition engine for a multimodal application (status: in force)

    Publication No.: US07840409B2

    Publication date: 2010-11-23

    Application No.: US11679284

    Filing date: 2007-02-27

    IPC classification: G10L21/06

    Abstract: Ordering recognition results produced by an automatic speech recognition (‘ASR’) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.
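
The weighting and sorting steps at the end of this abstract can be sketched in a few lines. The function `order_results` and the `weigh` callback are hypothetical; in particular, the descending order (heaviest first) is an assumption, since the abstract only says the results are sorted in dependence upon their weights.

```python
def order_results(results, weigh):
    """Sort recognition results by weight, heaviest first (an assumed order)."""
    weighted = [(weigh(result), result) for result in results]
    # list.sort is stable, so equal weights keep the recognizer's original order.
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return [result for _, result in weighted]
```

Here `weigh` stands in for whatever a semantic interpretation script would compute for each result; any callable that maps a result to a number would fit this sketch.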
