Modelling and processing filled pauses and noises in speech recognition
    1.
    Granted patent
    Status: Lapsed

    Publication No.: US07076422B2

    Publication date: 2006-07-11

    Application No.: US10388259

    Filing date: 2003-03-13

    Applicant: Mei-Yuh Hwang

    Inventor: Mei-Yuh Hwang

    IPC classification: G10L15/20

    CPC classification: G10L15/142 G10L2021/02168

    Abstract: A speech recognition system recognizes filled pause utterances made by a speaker. In one embodiment, an ergodic model is used to acoustically model filled pauses, which provides the flexibility to accommodate varying utterances of the filled pauses. The ergodic HMM model can also be used for other types of noise such as, but not limited to, breathing, keyboard operation, microphone noise, laughter, door openings and/or closings, or any other noise occurring in the environment of the user or made by the user. Similarly, silence can be modeled using an ergodic HMM model. Recognition can be used with N-gram, context-free grammar or hybrid language models.
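    The ergodic topology mentioned in this abstract can be illustrated with a small sketch. It assumes the common reading of "ergodic" as a fully connected HMM in which any state can follow any other, and contrasts it with a left-to-right topology. The function names, the uniform jump probabilities and the `self_loop` parameter are illustrative assumptions, not details taken from the patent.

```python
def ergodic_transitions(n_states, self_loop=0.5):
    """Return a row-stochastic matrix in which every state can reach every state."""
    if n_states < 2:
        raise ValueError("an ergodic topology needs at least two states")
    if not 0.0 < self_loop < 1.0:
        raise ValueError("self_loop must lie strictly between 0 and 1")
    # Spread the remaining probability mass evenly over all other states.
    jump = (1.0 - self_loop) / (n_states - 1)
    return [[self_loop if i == j else jump for j in range(n_states)]
            for i in range(n_states)]


def left_to_right_transitions(n_states, self_loop=0.5):
    """Return a left-to-right matrix for comparison: stay, or advance one state."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            matrix[i][i] = 1.0  # the final state only loops in this toy version
        else:
            matrix[i][i] = self_loop
            matrix[i][i + 1] = 1.0 - self_loop
    return matrix


if __name__ == "__main__":
    for row in ergodic_transitions(3):
        print(row)
```

    Because no transition is forbidden, a model of this shape does not force a filled pause or noise to pass through its states in a fixed order, which is one way to read the flexibility the abstract describes.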

    Generating a task-adapted acoustic model from one or more supervised and/or unsupervised corpora
    2.
    Granted patent
    Status: Lapsed

    Publication No.: US07031918B2

    Publication date: 2006-04-18

    Application No.: US10103184

    Filing date: 2002-03-20

    Applicant: Mei Yuh Hwang

    Inventor: Mei Yuh Hwang

    IPC classification: G10L15/06

    Abstract: Unsupervised speech data is provided to a speech recognizer that recognizes the speech data and outputs a recognition result along with a confidence measure for each recognized word. A task-related acoustic model is generated based on the recognition result, the speech data and the confidence measure. An additional task-independent model can also be used. The speech data can be weighted by the confidence measure in generating the acoustic model so that only data that has been recognized with a high degree of confidence will weigh heavily in generation of the acoustic model. The acoustic model can be formed from a Gaussian mean and variance of the data.
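    The confidence weighting described in this abstract can be sketched as a weighted estimate of a Gaussian mean and variance. This is a minimal illustration that assumes scalar features and one confidence value per frame; the function name `weighted_gaussian` and the data layout are hypothetical and not taken from the patent.

```python
def weighted_gaussian(frames, confidences):
    """Estimate a Gaussian mean and variance with per-frame confidence weights."""
    if len(frames) != len(confidences):
        raise ValueError("need exactly one confidence value per frame")
    total = sum(confidences)
    if total <= 0:
        raise ValueError("at least one frame must have positive confidence")
    # Frames recognized with low confidence contribute little to either statistic.
    mean = sum(c * x for x, c in zip(frames, confidences)) / total
    variance = sum(c * (x - mean) ** 2 for x, c in zip(frames, confidences)) / total
    return mean, variance


if __name__ == "__main__":
    # The third frame has zero confidence, so it does not affect the estimate.
    print(weighted_gaussian([1.0, 3.0, 100.0], [1.0, 1.0, 0.0]))
```

    With a weight of zero the outlying frame is ignored entirely; intermediate confidences would let it pull the mean and variance proportionally.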

    Method for adding phonetic descriptions to a speech recognition lexicon
    3.
    Granted patent
    Status: Lapsed

    Publication No.: US06973427B2

    Publication date: 2005-12-06

    Application No.: US09748453

    Filing date: 2000-12-26

    CPC classification: G10L15/063 G10L2015/0636

    Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, two possible phonetic descriptions are generated. One phonetic description is formed from the text of the word. The other phonetic description is formed by decoding a speech signal representing the user's pronunciation of the word. Both phonetic descriptions are scored based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.

    Speech recognition with mixtures of Bayesian networks
    4.
    Granted patent
    Status: In force

    Publication No.: US06336108B1

    Publication date: 2002-01-01

    Application No.: US09220197

    Filing date: 1998-12-23

    IPC classification: G06F15/18

    Abstract: The invention performs speech recognition using an array of mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of those states. In accordance with the invention, the MBNs encode the probabilities of observing the sets of acoustic observations given the utterance of a respective one of said parts of speech. Each of the HSBNs encodes the probabilities of observing the sets of acoustic observations given the utterance of a respective one of the parts of speech and given a hidden common variable being in a particular state. Each HSBN has nodes corresponding to the elements of the acoustic observations. These nodes store probability parameters corresponding to the probabilities with causal links representing dependencies between ones of said nodes.
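    The relationship between an MBN and its HSBNs can be written as a mixture over the states of the common external hidden variable. The notation here is an assumption introduced for illustration and does not come from the patent: o is the set of acoustic observations, u is the part of speech being uttered, C is the common external hidden variable and K is its number of states, which equals the number of HSBNs.

```latex
\[
  p(\mathbf{o} \mid u) \;=\; \sum_{c=1}^{K} p(C = c \mid u)\; p(\mathbf{o} \mid u,\, C = c)
\]
```

    Under this reading, the left-hand side is the quantity the abstract attributes to the MBN, each factor p(o | u, C = c) is the quantity it attributes to the c-th HSBN, and the mixture weights p(C = c | u) are the probabilities of the hidden variable's states.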

    Statistical machine translation based search query spelling correction
    5.

    Publication No.: US10176168B2

    Publication date: 2019-01-08

    Application No.: US13296640

    Filing date: 2011-11-15

    IPC classification: G06F17/30 G06F17/28 G06F17/27

    Abstract: Statistical Machine Translation (SMT) based search query spelling correction techniques are described herein. In one or more implementations, search data regarding searches performed by clients may be logged. The logged data includes query correction pairs that may be used to ascertain error patterns indicating how misspelled substrings may be translated to corrected substrings. The error patterns may be used to determine suggestions for an input query and to develop query correction models used to translate the input query to a corrected query. In one or more implementations, probabilistic features from multiple query correction models are combined to score different correction candidates. One or more top scoring correction candidates may then be exposed as suggestions for selection by a user and/or provided to a search engine to conduct a corresponding search using the corrected query version(s).
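    One way to picture the step in which probabilistic features from several models are combined to score correction candidates is a weighted sum of log-probabilities. The abstract does not say how the features are combined, so the log-linear form, the feature names and the weights below are all assumptions made for illustration.

```python
import math


def score(features, weights):
    """Combine per-model probabilities into one score (weighted log-linear sum)."""
    return sum(weights[name] * math.log(p) for name, p in features.items())


def rank_candidates(candidates, weights, top_k=1):
    """Return the top_k candidate strings ordered from best to worst score."""
    ranked = sorted(candidates,
                    key=lambda text: score(candidates[text], weights),
                    reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Hypothetical feature values for two candidate corrections of one query.
    candidates = {
        "speech recognition": {"translation": 0.6, "language": 0.5},
        "speach recognition": {"translation": 0.3, "language": 0.01},
    }
    weights = {"translation": 1.0, "language": 1.0}
    print(rank_candidates(candidates, weights))
```

    The top-scoring strings returned by `rank_candidates` correspond to the suggestions that the abstract says may be shown to the user or passed to the search engine.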

    SYSTEM AND METHOD FOR EFFICIENT LASER PROCESSING OF A MOVING WEB-BASED MATERIAL
    6.
    Patent application
    Status: In force

    Publication No.: US20110015927A1

    Publication date: 2011-01-20

    Application No.: US12884434

    Filing date: 2010-09-17

    IPC classification: G10L15/26

    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

    Method and apparatus for constructing and using syllable-like unit language models
    7.
    Patent application
    Status: In force

    Publication No.: US20050187769A1

    Publication date: 2005-08-25

    Application No.: US11110602

    Filing date: 2005-04-20

    IPC classification: G10L15/06 G10L15/00

    CPC classification: G10L15/063 G10L2015/0636

    Abstract: A method and computer-readable medium use syllable-like units (SLUs) to decode a pronunciation into a phonetic description. The syllable-like units are generally larger than a single phoneme but smaller than a word. The present invention provides a means for defining these syllable-like units and for generating a language model based on these syllable-like units that can be used in the decoding process. As SLUs are longer than phonemes, they contain more acoustic contextual clues and better lexical constraints for speech recognition. Thus, the phoneme accuracy produced from SLU recognition is much better than all-phone sequence recognition.

    Automatic speech recognition learning using user corrections
    8.
    Patent application
    Status: In force

    Publication No.: US20050159949A1

    Publication date: 2005-07-21

    Application No.: US10761451

    Filing date: 2004-01-20

    IPC classification: G10L15/22 G10L15/00

    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

    Method and system for dynamically adjusted training for speech recognition
    9.
    Granted patent
    Status: Lapsed

    Publication No.: US5963903A

    Publication date: 1999-10-05

    Application No.: US673435

    Filing date: 1996-06-28

    CPC classification: G10L15/063 G10L2015/0635

    Abstract: A method and system for dynamically selecting words for training a speech recognition system. The speech recognition system models each phoneme using a hidden Markov model and represents each word as a sequence of phonemes. The training system ranks each phoneme for each frame according to the probability that the corresponding codeword will be spoken as part of the phoneme. The training system collects spoken utterances for which the corresponding word is known. The training system then aligns the codewords of each utterance with the phoneme that it is recognized to be part of. The training system then calculates an average rank for each phoneme using the aligned codewords for the aligned frames. Finally, the training system selects words for training that contain phonemes with a low rank.
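    The last two steps of this abstract, averaging the rank of each phoneme and selecting words that contain poorly ranked phonemes, can be sketched as follows. The sketch assumes that rank 1 is the best rank, so that a "low rank" phoneme is one whose average rank number is large. The function names, the `threshold` parameter and the toy lexicon are hypothetical.

```python
from collections import defaultdict


def average_ranks(aligned_frames):
    """Average the per-frame rank of each phoneme over all aligned frames."""
    totals = defaultdict(lambda: [0, 0])  # phoneme -> [sum of ranks, frame count]
    for phoneme, rank in aligned_frames:
        totals[phoneme][0] += rank
        totals[phoneme][1] += 1
    return {phoneme: s / n for phoneme, (s, n) in totals.items()}


def select_training_words(lexicon, avg_rank, threshold):
    """Pick words containing at least one phoneme whose average rank is poor."""
    return [word for word, phonemes in lexicon.items()
            if any(avg_rank.get(p, 0) > threshold for p in phonemes)]


if __name__ == "__main__":
    # (phoneme, rank) pairs as they might come out of a forced alignment.
    aligned = [("k", 1), ("k", 3), ("ae", 8), ("ae", 10), ("t", 2)]
    ranks = average_ranks(aligned)
    lexicon = {"cat": ["k", "ae", "t"], "kit": ["k", "ih", "t"]}
    print(select_training_words(lexicon, ranks, threshold=5))
```

    Phonemes that never appeared in the aligned data are treated here as not needing training, which is a simplification; a real system would need its own policy for unseen phonemes.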

    Automatic speech recognition learning using categorization and selective incorporation of user-initiated corrections
    10.
    Granted patent
    Status: In force

    Publication No.: US08280733B2

    Publication date: 2012-10-02

    Application No.: US12884434

    Filing date: 2010-09-17

    Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.