Automatic speech recognition learning using categorization and selective incorporation of user-initiated corrections
    1.
    Granted patent
    Status: In force

    Publication number: US08280733B2

    Publication date: 2012-10-02

    Application number: US12884434

    Filing date: 2010-09-17

    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

    Automatic speech recognition learning using user corrections
    2.
    Granted patent
    Status: In force

    Publication number: US08019602B2

    Publication date: 2011-09-13

    Application number: US10761451

    Filing date: 2004-01-20

    IPC classification: G10L15/00 G10L15/26 G10L21/00

    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

    SYSTEM AND METHOD FOR EFFICIENT LASER PROCESSING OF A MOVING WEB-BASED MATERIAL
    3.
    Patent application
    Status: In force

    Publication number: US20110015927A1

    Publication date: 2011-01-20

    Application number: US12884434

    Filing date: 2010-09-17

    IPC classification: G10L15/26

    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

    Automatic speech recognition learning using user corrections
    4.
    Patent application
    Status: In force

    Publication number: US20050159949A1

    Publication date: 2005-07-21

    Application number: US10761451

    Filing date: 2004-01-20

    IPC classification: G10L15/22 G10L15/00

    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

    Modelling and processing filled pauses and noises in speech recognition
    5.
    Granted patent
    Status: Lapsed

    Publication number: US07076422B2

    Publication date: 2006-07-11

    Application number: US10388259

    Filing date: 2003-03-13

    Applicant: Mei-Yuh Hwang

    Inventor: Mei-Yuh Hwang

    IPC classification: G10L15/20

    CPC classification: G10L15/142 G10L2021/02168

    Abstract: A speech recognition system recognizes filled pause utterances made by a speaker. In one embodiment, an ergodic model is used to acoustically model filled pauses that provides flexibility allowing varying utterances of the filled pauses to be made. The ergodic HMM model can also be used for other types of noise such as but not limited to breathing, keyboard operation, microphone noise, laughter, door openings and/or closings, or any other noise occurring in the environment of the user or made by the user. Similarly, silence can be modeled using an ergodic HMM model. Recognition can be used with N-gram, context-free grammar or hybrid language models.

    Method for adding phonetic descriptions to a speech recognition lexicon
    6.
    Granted patent
    Status: Lapsed

    Publication number: US06973427B2

    Publication date: 2005-12-06

    Application number: US09748453

    Filing date: 2000-12-26

    CPC classification: G10L15/063 G10L2015/0636

    Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, two possible phonetic descriptions are generated. One phonetic description is formed from the text of the word. The other phonetic description is formed by decoding a speech signal representing the user's pronunciation of the word. Both phonetic descriptions are scored based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.
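
The selection step in this abstract can be illustrated with a minimal sketch. It assumes both candidate phonetic descriptions are already available as phone lists and that some scoring function rates each one against the user's pronunciation. The names `overlap_score` and `select_pronunciation` and the example phones are hypothetical, and the toy similarity score only stands in for the acoustic scoring the abstract refers to.

```python
from difflib import SequenceMatcher
from typing import Callable, Sequence

Phones = Sequence[str]


def overlap_score(candidate: Phones, reference: Phones) -> float:
    """Toy stand-in for an acoustic score: similarity ratio in [0, 1]."""
    return SequenceMatcher(None, list(candidate), list(reference)).ratio()


def select_pronunciation(
    text_based: Phones,
    speech_based: Phones,
    score: Callable[[Phones], float],
) -> Phones:
    """Score both candidates and return the higher-scoring one.

    On a tie, max() keeps the first candidate, i.e. the text-based one.
    """
    return max([text_based, speech_based], key=score)


# Hypothetical example: the user pronounces the word with an initial "w".
reference = ["w", "ow", "l", "t", "er"]
text_based = ["v", "ow", "l", "t", "er"]
speech_based = ["w", "ow", "l", "t", "er"]

chosen = select_pronunciation(
    text_based, speech_based, lambda phones: overlap_score(phones, reference)
)
```

In this example the speech-based candidate matches the reference exactly, so it is the one that would be entered in the lexicon.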

    Speech recognition with mixtures of bayesian networks
    7.
    Granted patent
    Status: In force

    Publication number: US06336108B1

    Publication date: 2002-01-01

    Application number: US09220197

    Filing date: 1998-12-23

    IPC classification: G06F15/18

    Abstract: The invention performs speech recognition using an array of mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of those states. In accordance with the invention, the MBNs encode the probabilities of observing the sets of acoustic observations given the utterance of a respective one of said parts of speech. Each of the HSBNs encodes the probabilities of observing the sets of acoustic observations given the utterance of a respective one of the parts of speech and given a hidden common variable being in a particular state. Each HSBN has nodes corresponding to the elements of the acoustic observations. These nodes store probability parameters corresponding to the probabilities with causal links representing dependencies between ones of said nodes.
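
The relationship between an MBN and its HSBNs can be written as a standard mixture. The notation below is an assumption introduced for illustration and does not come from the source: $O$ is a set of acoustic observations, $w$ a part of speech, $C$ the common external hidden variable with $K$ states, and $B_c$ the HSBN associated with state $c$.

```latex
p(O \mid w) \;=\; \sum_{c=1}^{K} p(C = c \mid w)\; p(O \mid C = c,\, w,\, B_c)
```

Under this reading, each HSBN supplies one conditional term $p(O \mid C = c, w, B_c)$, and the MBN combines the $K$ terms, weighted by the probability of each state of the hidden variable.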

    New-word pronunciation learning using a pronunciation graph
    8.
    Granted patent
    Status: Lapsed

    Publication number: US07590533B2

    Publication date: 2009-09-15

    Application number: US10796921

    Filing date: 2004-03-10

    Applicant: Mei-Yuh Hwang

    Inventor: Mei-Yuh Hwang

    IPC classification: G10L15/00

    Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, a plurality of at least two possible phonetic descriptions are generated. One phonetic description is formed by decoding a speech signal representing a user's pronunciation of the word. At least one other phonetic description is generated from the text of the word. The plurality of possible sequences comprising speech-based and text-based phonetic descriptions are aligned and scored in a single graph based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.

    Generating large units of graphonemes with mutual information criterion for letter to sound conversion
    10.
    Granted patent
    Status: Lapsed

    Publication number: US07693715B2

    Publication date: 2010-04-06

    Application number: US10797358

    Filing date: 2004-03-10

    IPC classification: G10L15/04

    CPC classification: G10L13/08

    Abstract: A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.
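
One merge step of the kind this abstract describes can be sketched as follows. The sketch makes several assumptions that are not in the source: units are plain letters rather than letter-phoneme pairs, the score is a probability-weighted pointwise mutual information over adjacent units, and the function names `best_pair` and `merge_pair` are hypothetical. The abstract does not give the exact scoring formula.

```python
import math
from collections import Counter


def best_pair(words):
    """Return the adjacent unit pair with the highest mutual information score."""
    unit_counts = Counter(u for w in words for u in w)
    pair_counts = Counter(p for w in words for p in zip(w, w[1:]))
    n_units = sum(unit_counts.values())
    n_pairs = sum(pair_counts.values())

    def score(pair):
        # Assumed score: p(a, b) * log(p(a, b) / (p(a) * p(b)))
        a, b = pair
        p_ab = pair_counts[pair] / n_pairs
        p_a = unit_counts[a] / n_units
        p_b = unit_counts[b] / n_units
        return p_ab * math.log(p_ab / (p_a * p_b))

    return max(pair_counts, key=score)


def merge_pair(words, pair):
    """Rewrite every word, joining each occurrence of `pair` into one unit."""
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and (w[i], w[i + 1]) == pair:
                out.append(w[i] + w[i + 1])  # the new, larger unit
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged


# Toy word set, each word given as a list of single-letter units.
words = [list("that"), list("this"), list("bath")]
pair = best_pair(words)
merged = merge_pair(words, pair)
```

On this toy word set the pair `("t", "h")` scores highest, so the merge yields the new unit `"th"`. Repeating the two steps on `merged` would grow larger units one merge at a time.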
