Method and apparatus for discriminative utterance verification using multiple confidence measures
    1.
    Granted invention patent
    Status: expired

    Publication No.: US6125345A

    Publication date: 2000-09-26

    Application No.: US934056

    Filing date: 1997-09-19

    IPC classification: G10L15/10 G10L5/06 G10L9/00

    CPC classification: G10L15/10

    Abstract: A multiple confidence measures subsystem of an automated speech recognition system allows otherwise independent confidence measures to be integrated and used for both training and testing on a consistent basis. Speech to be recognized is input to a speech recognizer and a recognition verifier of the multiple confidence measures subsystem. The speech recognizer generates one or more confidence measures. The speech recognizer preferably generates a misclassification error (MCE) distance as one of the confidence measures. The recognized speech output by the speech recognizer is input to the recognition verifier, which outputs one or more confidence measures. The recognition verifier preferably outputs a misverification error (MVE) distance as one of the confidence measures. The confidence measures output by the speech recognizer and the recognition verifier are normalized and then input to an integrator. The integrator integrates the various confidence measures during both a training phase for the hidden Markov models implemented in the speech recognizer and the recognition verifier and during testing of the input speech. The integrator is preferably implemented using a multi-layer perceptron (MLP). The output of the integrator, rather than the recognition verifier, determines whether the recognized utterance hypothesis generated by the speech recognizer should be accepted or rejected.

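    The integration step described in this abstract (normalize the confidence measures, feed them to an MLP, accept or reject on its output) can be sketched as follows. This is only an illustration: the z-score normalization, the network size, the fixed weights, and the 0.5 threshold are all assumptions, and a real integrator would learn its weights during training.

```python
import math

# Hypothetical, untrained weights for a 2-input, 2-hidden-unit, 1-output MLP.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, -1.0]]
HIDDEN_BIASES = [0.0, 0.0]
OUTPUT_WEIGHTS = [2.0, 0.0]
OUTPUT_BIAS = -1.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def normalize(raw, mean, std):
    """Assumed z-score normalization of one raw confidence measure."""
    return (raw - mean) / std


def integrate(measures):
    """Forward pass of the small MLP over the normalized measures."""
    hidden = [
        sigmoid(sum(w * m for w, m in zip(weights, measures)) + bias)
        for weights, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    return sigmoid(sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden)) + OUTPUT_BIAS)


def verify(measures, threshold=0.5):
    """Accept the hypothesis when the integrator output exceeds the threshold."""
    return integrate(measures) > threshold


# Hypothetical MCE and MVE distances, normalized before integration.
normalized = [normalize(3.0, 1.0, 1.0), normalize(2.5, 1.0, 1.0)]
good = verify(normalized)
bad = verify([-2.0, -1.5])
```

    The point of the sketch is that the accept/reject decision comes from the combined output rather than from either measure alone.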

    Speech spurt detecting apparatus and method with threshold adapted by noise and speech statistics
    3.
    Granted invention patent
    Status: expired

    Publication No.: US6044342A

    Publication date: 2000-03-28

    Application No.: US978481

    Filing date: 1997-11-25

    CPC classification: G10L25/78

    Abstract: A speech spurt detecting apparatus for detecting speech spurts in a voice signal has a storage for storing an input voice signal. A decision portion determines speech spurt sections and mute sections using a threshold value and sets one of the mute sections at a latter part of a hangover time. A mute level statistical processor estimates the noise distribution of a signal in the mute sections. A speech spurt detecting threshold value decision portion receives the average and the variance of the noise distribution from the mute level statistical processor and approximates the noise distribution to a gamma distribution to decide a speech spurt detecting threshold. A speech spurt transmitting portion outputs the voice signal in the speech spurt sections from the storage. A speech spurt level statistical processor carries out statistical processing of the speech spurt sections. The speech spurt detecting threshold value decision portion detects an error of the speech spurt detecting threshold value using the speech spurt level statistical processor and the mute level statistical processor and resets the speech spurt detecting threshold value to its initial value if the error exceeds a predetermined value. The speech spurt detecting threshold value decision portion increases the speech spurt detecting threshold value at a fixed rate in each of the speech spurt sections, and computes (the average)²/(the variance) to obtain an adjusting coefficient and computes (the adjusting coefficient) × (the average) to obtain the speech spurt detecting threshold value.

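    The threshold arithmetic at the end of this abstract can be worked through in a few lines. The sketch below assumes the mute-section levels are available as a list of numbers and that the population variance is meant; the abstract specifies neither. The ratio mean²/variance is also the method-of-moments estimate of a gamma distribution's shape parameter, which fits the gamma approximation the abstract mentions.

```python
from statistics import fmean, pvariance


def spurt_threshold(mute_levels):
    """Return (adjusting coefficient) x (average) for the given noise levels."""
    mean = fmean(mute_levels)
    variance = pvariance(mute_levels, mu=mean)  # assumption: population variance
    if variance == 0:
        raise ValueError("variance is zero, so the coefficient is undefined")
    coefficient = mean ** 2 / variance  # the "adjusting coefficient"
    return coefficient * mean


# Hypothetical noise levels measured in mute sections.
levels = [2.0, 4.0, 4.0, 6.0]
threshold = spurt_threshold(levels)  # mean 4, variance 2, coefficient 8
```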

    Hypertext navigation system controlled by spoken words
    4.
    Granted invention patent
    Status: expired

    Publication No.: US6029135A

    Publication date: 2000-02-22

    Application No.: US557525

    Filing date: 1995-11-14

    Abstract: A hypertext navigation system that is controllable by spoken words has hypertext documents to which specific dictionaries and probability models for assisting in an acoustic voice recognition of hyper-links of this hypertext document are allocated. Control of a hypertext viewer or, respectively, browser and navigation in the hypertext document or hypertext system by pronouncing links is provided. The voice recognition is thereby optimally adapted to the links to be recognized without these having to be previously known.


    Tree structured cohort selection for speaker recognition system
    5.
    Granted invention patent
    Status: expired

    Publication No.: US6006184A

    Publication date: 1999-12-21

    Application No.: US14565

    Filing date: 1998-01-28

    CPC classification: G10L17/12 G10L15/08

    Abstract: In a speaker recognition system, a tree-structured reference pattern storing unit has first through M-th node stages each of which has nodes that respectively store a reference pattern of inhibiting speakers. The reference pattern of each node of (N-1)-th node stage represents acoustic features in the reference patterns of predetermined ones of the nodes of the N-th node stage. An analysis unit analyzes input speech and converts the input speech into feature vectors. A similarities calculating unit calculates similarities between the feature vectors and the reference patterns of all of the inhibiting speakers. An inhibiting speaker selecting unit sorts the similarities and selects a predetermined number of inhibiting speakers. The similarities calculating unit calculates the similarity of the node of the first node stage and calculates the similarities of ones of the nodes of the N-th node stage which are connected to a predetermined number of nodes of the (N-1)-th node stage, selected in an order based on highest similarities.

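    The stage-by-stage search in this abstract resembles a beam search down a tree: at each stage only the children of the best-scoring nodes from the previous stage are scored. The sketch below is one possible reading, not the patented procedure. It assumes a tree of uniform depth, a single feature vector, and negative squared Euclidean distance as the similarity; the names `node`, `beam`, and `cohort_size` are hypothetical.

```python
def node(name, pattern, children=()):
    return {"name": name, "pattern": pattern, "children": list(children)}


def similarity(vector, pattern):
    # Assumed measure: negative squared Euclidean distance (higher is closer).
    return -sum((v - p) ** 2 for v, p in zip(vector, pattern))


def select_cohort(root, vector, beam, cohort_size):
    """Descend the tree, scoring only children of the best nodes per stage."""
    frontier = [root]
    while frontier[0]["children"]:  # assumes every branch has the same depth
        candidates = [c for parent in frontier for c in parent["children"]]
        candidates.sort(key=lambda n: similarity(vector, n["pattern"]), reverse=True)
        at_leaf_stage = not candidates[0]["children"]
        frontier = candidates[: cohort_size if at_leaf_stage else beam]
    return [n["name"] for n in frontier]


# Hypothetical three-stage tree: root, three group nodes, six speakers.
tree = node("root", [5.0, 5.0], [
    node("g1", [0.0, 0.0], [node("s1", [0.0, 1.0]), node("s2", [1.0, 0.0])]),
    node("g2", [5.0, 5.0], [node("s3", [5.0, 4.0]), node("s4", [4.0, 5.0])]),
    node("g3", [10.0, 10.0], [node("s5", [10.0, 9.0]), node("s6", [9.0, 10.0])]),
])
cohort = select_cohort(tree, [0.5, 1.5], beam=2, cohort_size=3)
```

    In this example the speakers under `g3` are never scored, which is the saving the tree structure is meant to provide.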

    Speech recognition rejection method using generalized additive models
    6.
    Granted invention patent
    Status: expired

    Publication No.: US6006182A

    Publication date: 1999-12-21

    Application No.: US934892

    Filing date: 1997-09-22

    IPC classification: G10L15/14 G10L5/06 G10L9/00

    CPC classification: G10L15/142

    Abstract: Systems and methods consistent with the present invention determine whether to accept one of a plurality of intermediate recognition results output by a speech recognition system as a final recognition result. The system first combines a plurality of speech rejection features into a feature function in which weights are assigned to each rejection feature in accordance with a recognition accuracy of each rejection feature. Feature values are then calculated for each of the rejection features using the plurality of intermediate recognition results. The system next computes the feature function according to the calculated feature values to determine a rejection decision value. Finally, one of the plurality of intermediate recognition results is accepted as the final recognition result according to the rejection decision value.

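    A minimal sketch of the weighted feature function and the accept/reject decision follows. It simplifies the idea to a plain weighted sum compared against a threshold; a generalized additive model would normally apply a separate function to each feature. The weights, feature values, and threshold are hypothetical.

```python
def rejection_decision_value(feature_values, weights):
    """Weighted sum of rejection feature values (simplified feature function)."""
    if len(feature_values) != len(weights):
        raise ValueError("one weight is required per rejection feature")
    return sum(w * f for w, f in zip(weights, feature_values))


def accept(feature_values, weights, threshold):
    """Accept the recognition result when the decision value reaches the threshold."""
    return rejection_decision_value(feature_values, weights) >= threshold


# Hypothetical weights, imagined as reflecting each feature's accuracy.
weights = [0.5, 0.25, 0.25]
features = [0.8, 0.4, 1.0]
score = rejection_decision_value(features, weights)
accepted = accept(features, weights, threshold=0.5)
```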

    System and method for creating a language grammar using a spreadsheet or table interface
    7.
    Granted invention patent
    Status: expired

    Publication No.: US5995918A

    Publication date: 1999-11-30

    Application No.: US932937

    Filing date: 1997-09-17

    Abstract: The present invention is a computer software system that allows the developer of a speech-enabled system to create a grammar and corpus for use in the system. A table interface is used, and phrases in the grammar are entered into cells in the table. The table also includes token data which corresponds to each valid utterance. When the grammar is defined, the computer software system automatically traverses the table to enumerate all possible valid utterances in the grammar. This traversal generates a listing (corpus) of valid utterances and their respective tokens. This listing can then be used to interpret spoken utterances for a speech-enabled system. The computer software system also transcribes the grammar rules found in the table to a format compatible with a variety of supported commercially-available speech recognizers.

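    The table traversal that enumerates every valid utterance can be illustrated with a Cartesian product over the cells of each row. The table layout below (one token per row, each column holding alternative phrases) is an assumption made for the example; the abstract does not describe the actual cell layout.

```python
from itertools import product

# Hypothetical grammar table: each row pairs a token with columns of
# alternative phrases; one phrase is taken from every column.
TABLE = [
    {"token": "BALANCE", "columns": [["check", "show"], ["my"], ["balance"]]},
    {"token": "TRANSFER", "columns": [["transfer"], ["money", "funds"]]},
]


def enumerate_corpus(table):
    """List every (utterance, token) pair the table can produce."""
    corpus = []
    for row in table:
        for words in product(*row["columns"]):
            corpus.append((" ".join(words), row["token"]))
    return corpus


corpus = enumerate_corpus(TABLE)
lookup = dict(corpus)  # maps a recognized utterance to its token
```

    A dictionary built from the corpus is one simple way to interpret a recognized utterance, for example `lookup["show my balance"]`.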

    Method and apparatus for obtaining transcriptions from multiple training utterances
    8.
    Granted invention patent
    Status: expired

    Publication No.: US5983177A

    Publication date: 1999-11-09

    Application No.: US994007

    Filing date: 1997-12-18

    IPC classification: G10L5/06

    CPC classification: G10L15/06 G10L15/187

    Abstract: The invention relates to a method and an apparatus for adding a new entry to a speech recognition dictionary, more particularly to a system and method for generating transcriptions from multiple utterances of a given word. The novel method and apparatus automatically transcribes several training utterances into transcriptions without knowledge of the orthography of the word being added. It also provides a method and apparatus for transcribing multiple utterances into a single transcription that can be added to a speech recognition dictionary. In a first step, each utterance is analyzed individually to get its respective acoustic characteristics. Following this, these characteristics are combined to generate a set of the most likely transcriptions using the acoustic information obtained from each of the training utterances.


    Speech recognition training
    9.
    Granted invention patent
    Status: expired

    Publication No.: US5963906A

    Publication date: 1999-10-05

    Application No.: US859360

    Filing date: 1997-05-20

    Applicant: William Turin

    Inventor: William Turin

    IPC classification: G10L15/14 G10L5/06 G10L9/00

    CPC classification: G10L15/144

    Abstract: A method and system performs speech recognition training using Hidden Markov Models. Initially, preprocessed speech signals that include a plurality of observations are stored by the system. Initial Hidden Markov Model (HMM) parameters are then assigned. Summations are then calculated using modified equations derived substantially from the following equations, wherein $u \le v < w$: $P(x_u^w) = P(x_u^v)\,P(x_{v+1}^w)$ and $\Omega_{ij}(x_u^w) = \Omega_{ij}(x_u^v)\,P(x_{v+1}^w) + P(x_u^v)\,\Omega_{ij}(x_{v+1}^w)$. HMM parameter reestimation is then performed using the calculated summations. It is then determined whether the HMM parameters have converged. If they have, the HMM parameters are stored. However, if the HMM parameters have not converged, the system again calculates the summations, performs HMM parameter reestimation using the summations, and determines whether the parameters have converged. The process is repeated iteratively until the HMM parameters have converged.
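    The first relation quoted in this abstract, P(x_u^w) = P(x_u^v) P(x_{v+1}^w), can be checked numerically under one common reading. The abstract does not define its symbols, so the sketch assumes that P of a segment is the ordered product of per-observation matrices whose entry (i, j) is the transition probability from state i to state j times the probability of emitting the observation in state j. The two-state model and the observation sequence are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def segment_matrix(mats, u, w):
    """Ordered product of the per-observation matrices for indices u..w inclusive."""
    result = mats[u]
    for t in range(u + 1, w + 1):
        result = matmul(result, mats[t])
    return result


# Hypothetical 2-state HMM: transition matrix A and emission table B.
A = [[0.7, 0.3], [0.4, 0.6]]
B = {"a": [0.9, 0.2], "b": [0.1, 0.8]}


def obs_matrix(symbol):
    # Entry (i, j): move from state i to state j, then emit `symbol` in state j.
    return [[A[i][j] * B[symbol][j] for j in range(2)] for i in range(2)]


observations = ["a", "b", "b", "a"]
mats = [obs_matrix(s) for s in observations]

whole = segment_matrix(mats, 0, 3)                                       # P(x_0^3)
split = matmul(segment_matrix(mats, 0, 1), segment_matrix(mats, 2, 3))  # P(x_0^1) P(x_2^3)
```

    Under this reading the relation is simply associativity of the matrix product, which is what lets sub-segments be computed separately and then combined.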

    System and method of recognizing an acoustic environment to adapt a set of based recognition models to the current acoustic environment for subsequent speech recognition
    10.
    Granted invention patent
    Status: expired

    Publication No.: US5960397A

    Publication date: 1999-09-28

    Application No.: US863927

    Filing date: 1997-05-27

    Applicant: Mazin G. Rahim

    Inventor: Mazin G. Rahim

    Abstract: A speech recognition system which effectively recognizes unknown speech from multiple acoustic environments includes a set of secondary models, each associated with one or more particular acoustic environments, integrated with a base set of recognition models. The speech recognition system is trained by making a set of secondary models in a first stage of training, and integrating the set of secondary models with a base set of recognition models in a second stage of training.
