Method and apparatus for improved speech recognition with supplementary information (status: active)





    CPC classification number: H04M1/271 G10L15/08 G10L15/10 G10L15/22

    Abstract: A method for improving recognition results of a speech recognizer uses supplementary information to confirm recognition results. A user inputs speech to a speech recognizer. The speech recognizer resides on a mobile device or on a server at a remote location. The speech recognizer determines a recognition result based on the input speech. A confidence measure is calculated for the recognition result. If the confidence measure is below a threshold, the user is prompted for supplementary data. The supplementary data is determined dynamically based on ambiguities between the input speech and the recognition result, such that the supplementary data distinguishes the input speech from potentially incorrect results. The supplementary data may be a subset of the alphanumeric characters that make up the input speech, or other data associated with a desired result, such as an area code or location. The user may provide the supplementary data verbally, or manually using a keypad, touchpad, touchscreen, or stylus pen.
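
The confidence check and the dynamically chosen prompt described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the `Hypothesis` type, the `CONFIDENCE_THRESHOLD` value, the `ask_user` callback, and the choice of "first differing character" as the supplementary data are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

CONFIDENCE_THRESHOLD = 0.8  # hypothetical value; the abstract gives no number


@dataclass
class Hypothesis:
    text: str          # candidate recognition result
    confidence: float  # assumed to lie in [0, 1]


def distinguishing_position(candidates: List[str]) -> Optional[int]:
    """Return the first character index at which the candidates differ."""
    shortest = min(len(c) for c in candidates)
    for i in range(shortest):
        if len({c[i].lower() for c in candidates}) > 1:
            return i
    return None  # no differing position within the shared length


def resolve(hypotheses: List[Hypothesis], ask_user: Callable[[int], str]) -> str:
    """Accept a confident result, otherwise prompt for one distinguishing character."""
    best = max(hypotheses, key=lambda h: h.confidence)
    if best.confidence >= CONFIDENCE_THRESHOLD:
        return best.text  # confident enough: no prompt needed
    pos = distinguishing_position([h.text for h in hypotheses])
    if pos is None:
        return best.text  # nothing to ask about; fall back to the top result
    answer = ask_user(pos).lower()  # character supplied by voice or keypad
    matches = [h for h in hypotheses if h.text[pos].lower() == answer]
    if not matches:
        return best.text  # the answer matched no candidate; keep the top result
    return max(matches, key=lambda h: h.confidence).text
```

For example, with the two low-confidence candidates "Jon Smith" and "Ron Smith", the sketch asks for the character at index 0, and an answer of "R" selects "Ron Smith". Other supplementary data mentioned in the abstract, such as an area code, would need a different comparison key.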

    Interactive personalized robot for home use (status: active)





    CPC classification number: G06N3/008

    Abstract: An interactive personalized robotic system for a home environment includes a home network in communication with at least one electronic device. A robot is in communication with the home network and is capable of controlling the at least one electronic device. The robot further includes a plurality of modules for personally communicating with a user. The user can control the robot and the at least one electronic device by communicating with the robot.

    Personalized agent for portable devices and cellular phone (status: active)





    Abstract: Personalized agent services are provided in a personal messaging device, such as a cellular telephone or personal digital assistant, through the services of a speech recognizer that converts speech into text and a text-to-speech synthesizer that converts text to speech. Both recognizer and synthesizer may be server-based or locally deployed within the device. The user dictates an e-mail message, which is converted to text and stored. The stored text is sent back to the user as text or as synthesized speech, to allow the user to edit the message and correct transcription errors before sending it as e-mail. The system includes a summarization module that prepares short summaries of incoming e-mail and voice mail. The user may access these summaries, and retrieve and organize e-mail and voice mail using speech commands.

    Dialogue device for call screening and classification (status: lapsed)





    CPC classification number: H04M1/663 H04M1/271 H04M1/57 H04M3/436

    Abstract: The call screener employs a dialogue system and a telephone system interface connected between a telephone network and a telephone device of a user. The interface selectively routes calls (or refrains from routing them) based on the results from the dialogue system. The dialogue system elicits speech from an incoming caller and causes the telephone system interface to route calls from the incoming caller based on a comparison of the elicited speech with a set of stored speaker models. The stored speaker models may be maintained automatically by the system, using either a passive mode, in which callers whose calls exceed a predetermined duration are assumed to be “acceptable”, or a proactive mode, in which the system prompts the user at the end of the call to elect whether to save the speech models developed during that call in the acceptable user database. If desired, the user can attach other attributes or special tags to the stored models, indicating special handling or call routing rules to be applied when that caller calls again.
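
A rough sketch of the routing decision and the two model-maintenance modes is given below. It is illustrative only: speaker models are reduced to plain feature vectors compared by cosine similarity, which is a stand-in for whatever speaker modelling the patent actually uses, and the `CallScreener` class, its thresholds, and the "ring"/"voicemail" outcomes are hypothetical names chosen for the example.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity, with 0.0 returned for a zero-length vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class CallScreener:
    def __init__(self, accept_threshold: float = 0.85, min_duration_s: float = 120.0):
        self.accept_threshold = accept_threshold  # assumed similarity cut-off
        self.min_duration_s = min_duration_s      # assumed "long call" duration
        self.models: Dict[str, List[float]] = {}  # caller label -> stored model

    def route(self, features: List[float]) -> str:
        """Compare the elicited speech with every stored model and pick a route."""
        best = max((cosine(features, m) for m in self.models.values()), default=0.0)
        return "ring" if best >= self.accept_threshold else "voicemail"

    def end_of_call(self, caller: str, features: List[float], duration_s: float,
                    user_approves: Optional[bool] = None) -> bool:
        """Decide whether to keep the model built during the call."""
        if user_approves is None:
            keep = duration_s > self.min_duration_s  # passive mode: long call
        else:
            keep = user_approves                     # proactive mode: user decides
        if keep:
            self.models[caller] = features
        return keep
```

In this sketch an unknown caller goes to "voicemail" until a model has been stored, either because a call ran longer than `min_duration_s` or because the user approved saving it. The special tags and per-caller routing rules mentioned in the abstract are left out.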

    Context-dependent acoustic models for medium and large vocabulary speech recognition with eigenvoice training (status: active)





    CPC classification number: G10L15/07

    Abstract: A reduced-dimensionality eigenvoice analytical technique is used during training to develop context-dependent acoustic models for allophones. The eigenvoice technique is also applied at run time to the speech of a new speaker. The technique removes individual speaker idiosyncrasies to produce more universally applicable and robust allophone models. In one embodiment, the eigenvoice technique is used to identify the centroid of each speaker, which may then be “subtracted out” of the recognition equation. In another embodiment, maximum likelihood estimation techniques are used to develop common decision tree frameworks that may be shared across all speakers when constructing the eigenvoice representation of speaker space.
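
The centroid-subtraction embodiment can be illustrated with a heavily simplified sketch. It assumes each training speaker is summarised by one "supervector" of model parameters, finds a single eigenvoice by power iteration (a PCA-style stand-in for the dimensionality reduction), estimates a new speaker's centroid by plain projection onto that eigenvoice, and subtracts it from the speaker's feature vectors. The function names are hypothetical, and the projection step is an assumption for the example rather than the estimation procedure used in the patent.

```python
import math
from typing import List, Tuple

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sub(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def leading_eigenvoice(supervectors: List[Vector], iterations: int = 50) -> Tuple[Vector, Vector]:
    """Return (mean, unit-length leading principal direction) of the speakers."""
    mean = mean_vector(supervectors)
    centred = [sub(v, mean) for v in supervectors]
    # Start from a non-zero centred vector so the start lies in the data span.
    direction = next((c for c in centred if any(c)), None)
    if direction is None:
        return mean, [0.0] * len(mean)  # all speakers identical: no eigenvoice
    for _ in range(iterations):
        # Multiply by the unnormalised covariance: sum_i (c_i . d) * c_i
        product = [0.0] * len(mean)
        for c in centred:
            weight = dot(c, direction)
            product = [p + weight * x for p, x in zip(product, c)]
        norm = math.sqrt(dot(product, product))
        direction = [p / norm for p in product]
    return mean, direction


def speaker_centroid(observation: Vector, mean: Vector, eigenvoice: Vector) -> Vector:
    """Estimate the speaker's centroid as mean + w * eigenvoice (projection)."""
    w = dot(sub(observation, mean), eigenvoice)
    return [m + w * e for m, e in zip(mean, eigenvoice)]


def subtract_centroid(frames: List[Vector], centroid: Vector) -> List[Vector]:
    """Remove the speaker-specific offset from each feature vector."""
    return [sub(f, centroid) for f in frames]
```

A practical system would keep several eigenvoices and work with much higher-dimensional supervectors; one direction is used here only to keep the example short.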

    System for identifying and adapting a TV-user profile by means of speech technology (status: active)





    Abstract: Speech input supplied by the user is evaluated by a speaker verification/identification module, and based on the evaluation, parameters are retrieved from a user profile database. These parameters adapt the speech models of the speech recognizer and also supply the natural language parser with customized dialog grammars. The user's speech is then interpreted by the speech recognizer and natural language parser to determine the meaning of the user's spoken input in order to control the television tuner. The parser works in conjunction with a command module that mediates the dialog with the user, providing on-screen prompts or synthesized speech queries to elicit further input from the user when needed. The system integrates with an electronic program guide, so that the natural language parser is aware of which programs are available when conducting the dialog with the user.

    Method and system for automatically determining phonetic transcriptions associated with spelled words (status: lapsed)





    CPC classification number: G10L15/065 G10L2015/086

    Abstract: New entries are added to the lexicon by entering them as spelled words. A transcription generator, such as a decision-tree-based phoneme or morpheme transcription generator, converts each spelled word into a set of n-best transcriptions or sequences. Meanwhile, user input or automatically generated speech corresponding to the spelled word is processed by an automatic speech recognizer, and the recognizer rescores the transcriptions or sequences produced by the transcription generator. One or more of the highest-scoring (highest-confidence) transcriptions may be added to the lexicon to update it. If desired, the pairs of spelled words and pronunciations generated by the system can be used to retrain the transcription generator, making the system adaptive or self-learning.
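
The rescoring step can be sketched as below. The example assumes the recognizer's contribution is available as a decoded phoneme sequence and scores each n-best candidate by its edit distance to that sequence; this distance is only a stand-in for a real acoustic rescoring, and `edit_distance`, `rescore`, and `add_to_lexicon` are hypothetical helper names.

```python
from typing import Dict, List, Sequence, Tuple

Phonemes = Tuple[str, ...]


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(previous[j] + 1,              # deletion
                               current[j - 1] + 1,           # insertion
                               previous[j - 1] + (x != y)))  # substitution
        previous = current
    return previous[-1]


def rescore(candidates: List[Phonemes], decoded: Phonemes) -> List[Phonemes]:
    """Order the n-best candidates from closest to farthest from the decoded speech."""
    # sorted() is stable, so ties keep the generator's original ranking.
    return sorted(candidates, key=lambda c: edit_distance(c, decoded))


def add_to_lexicon(lexicon: Dict[str, List[Phonemes]], word: str,
                   candidates: List[Phonemes], decoded: Phonemes,
                   keep: int = 1) -> List[Phonemes]:
    """Add the `keep` best-scoring transcriptions for `word` to the lexicon."""
    best = rescore(candidates, decoded)[:keep]
    entries = lexicon.setdefault(word, [])
    for transcription in best:
        if transcription not in entries:
            entries.append(transcription)
    return best
```

For the spelled word "read" with candidates `("r", "iy", "d")` and `("r", "eh", "d")`, speech decoded as `("r", "eh", "d")` moves the second candidate to the top. The resulting word-pronunciation pairs are what the abstract suggests feeding back to retrain the generator.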
