Patent search: ap:("Dong Yu" OR "Peter Mau" OR "Mei-Yuh Hwang" OR "Alejandro Acero") AND inv:"Mei-Yuh Hwang" (page 1)

1.

Granted patent
Automatic speech recognition learning using categorization and selective incorporation of user-initiated corrections
Legal status: In force

Publication number: US08280733B2

Publication date: 2012-10-02

Application number: US12884434

Filing date: 2010-09-17

Applicants: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

IPC classification: G10L15/00, G10L15/06, G10L15/04, G10L15/14, G10L21/00

CPC classification: G10L15/065, G10L15/063, G10L2015/0631

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

2.

Granted patent
Automatic speech recognition learning using user corrections
Legal status: In force

Publication number: US08019602B2

Publication date: 2011-09-13

Application number: US10761451

Filing date: 2004-01-20

Applicants: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

IPC classification: G10L15/00, G10L15/26, G10L21/00

CPC classification: G10L15/065, G10L15/063, G10L2015/0631

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

3.

Patent application
Automatic speech recognition learning using categorization and selective incorporation of user-initiated corrections
Legal status: In force

Publication number: US20110015927A1

Publication date: 2011-01-20

Application number: US12884434

Filing date: 2010-09-17

Applicants: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

IPC classification: G10L15/26

CPC classification: G10L15/065, G10L15/063, G10L2015/0631

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

4.

Patent application
Automatic speech recognition learning using user corrections
Legal status: In force

Publication number: US20050159949A1

Publication date: 2005-07-21

Application number: US10761451

Filing date: 2004-01-20

Applicants: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

Inventors: Dong Yu, Peter Mau, Mei-Yuh Hwang, Alejandro Acero

IPC classification: G10L15/22, G10L15/00

CPC classification: G10L15/065, G10L15/063, G10L2015/0631

Abstract: An automatic speech recognition system recognizes user changes to dictated text and infers whether such changes result from the user changing his/her mind, or whether such changes are a result of a recognition error. If a recognition error is detected, the system uses the type of user correction to modify itself to reduce the chance that such recognition error will occur again. Accordingly, the system and methods provide for significant speech recognition learning with little or no additional user interaction.

5.

Granted patent
Modelling and processing filled pauses and noises in speech recognition
Legal status: Lapsed

Publication number: US07076422B2

Publication date: 2006-07-11

Application number: US10388259

Filing date: 2003-03-13

Applicant: Mei-Yuh Hwang

Inventor: Mei-Yuh Hwang

IPC classification: G10L15/20

CPC classification: G10L15/142, G10L2021/02168

Abstract: A speech recognition system recognizes filled pause utterances made by a speaker. In one embodiment, an ergodic model is used to acoustically model filled pauses, which provides flexibility allowing varying utterances of the filled pauses to be made. The ergodic HMM model can also be used for other types of noise such as but not limited to breathing, keyboard operation, microphone noise, laughter, door openings and/or closings, or any other noise occurring in the environment of the user or made by the user. Similarly, silence can be modeled using an ergodic HMM model. Recognition can be used with N-gram, context-free grammar or hybrid language models.
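
As a rough illustration of what "ergodic" means in the abstract above, the sketch below builds a fully connected transition matrix (every state can reach every other state) and scores a discrete observation sequence with the standard forward algorithm. The function names, the uniform off-diagonal probabilities, and the discrete emission tables are assumptions made for this example and are not taken from the patent.

```python
import math


def ergodic_transitions(n_states, self_loop=0.5):
    """Fully connected (ergodic) transition matrix.

    Assumption for illustration: each state keeps `self_loop` probability
    for itself and spreads the rest evenly over all other states.
    """
    if n_states == 1:
        return [[1.0]]
    off_diagonal = (1.0 - self_loop) / (n_states - 1)
    return [
        [self_loop if i == j else off_diagonal for j in range(n_states)]
        for i in range(n_states)
    ]


def forward_log_likelihood(obs, init, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model).

    obs   -- list of discrete symbol indices
    init  -- initial state probabilities
    trans -- trans[i][j] = P(next state j | current state i)
    emit  -- emit[s][k] = P(symbol k | state s)
    """
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for symbol in obs[1:]:
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    return log_lik + math.log(sum(alpha))


if __name__ == "__main__":
    trans = ergodic_transitions(3)
    init = [1.0 / 3] * 3
    # Toy emission tables over two symbols (hypothetical values).
    emit = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
    print(forward_log_likelihood([0, 1, 1, 0], init, trans, emit))
```

Because no transition has zero probability, such a model places no constraint on the order in which states are visited, which is one plausible reading of the flexibility the abstract mentions.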

6.

Granted patent
Method for adding phonetic descriptions to a speech recognition lexicon
Legal status: Lapsed

Publication number: US06973427B2

Publication date: 2005-12-06

Application number: US09748453

Filing date: 2000-12-26

Applicants: Mei-Yuh Hwang, Fileno A. Alleva, Rebecca C. Weiss

Inventors: Mei-Yuh Hwang, Fileno A. Alleva, Rebecca C. Weiss

IPC classification: G10L15/06, G06F17/21, G01L15/04, G01L15/06, G01L15/08

CPC classification: G10L15/063, G10L2015/0636

Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, two possible phonetic descriptions are generated. One phonetic description is formed from the text of the word. The other phonetic description is formed by decoding a speech signal representing the user's pronunciation of the word. Both phonetic descriptions are scored based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.

7.

Granted patent
Speech recognition with mixtures of Bayesian networks
Legal status: In force

Publication number: US06336108B1

Publication date: 2002-01-01

Application number: US09220197

Filing date: 1998-12-23

Applicants: Bo Thiesson, Christopher A. Meek, David Maxwell Chickering, David Earl Heckerman, Fileno A. Alleva, Mei-Yuh Hwang

Inventors: Bo Thiesson, Christopher A. Meek, David Maxwell Chickering, David Earl Heckerman, Fileno A. Alleva, Mei-Yuh Hwang

IPC classification: G06F15/18

CPC classification: G06K9/6296, G06N5/025, Y10S707/99945, Y10S707/99948

Abstract: The invention performs speech recognition using an array of mixtures of Bayesian networks. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN models the world under the hypothesis that the common external hidden variable is in a corresponding one of those states. In accordance with the invention, the MBNs encode the probabilities of observing the sets of acoustic observations given the utterance of a respective one of said parts of speech. Each of the HSBNs encodes the probabilities of observing the sets of acoustic observations given the utterance of a respective one of the parts of speech and given a hidden common variable being in a particular state. Each HSBN has nodes corresponding to the elements of the acoustic observations. These nodes store probability parameters corresponding to the probabilities with causal links representing dependencies between ones of said nodes.
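
The relationship between an MBN and its HSBNs described in the abstract above can be written as a marginalization over the common external hidden variable. The notation here is an assumption made for illustration and does not come from the patent: u denotes a part of speech, o the vector of acoustic observations, C the common external hidden variable with K states, and the second factor in each term is the probability encoded by the HSBN for state c.

```latex
p(\mathbf{o} \mid u) \;=\; \sum_{c=1}^{K} p(C = c \mid u)\; p(\mathbf{o} \mid C = c,\, u)
```

Under this reading, the number of terms in the sum equals the number of HSBNs in the MBN, matching the abstract's statement that the two counts correspond.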

8.

Granted patent
New-word pronunciation learning using a pronunciation graph
Legal status: Lapsed

Publication number: US07590533B2

Publication date: 2009-09-15

Application number: US10796921

Filing date: 2004-03-10

Applicant: Mei-Yuh Hwang

Inventor: Mei-Yuh Hwang

IPC classification: G10L15/00

CPC classification: G10L15/063, G10L15/187, G10L2015/025

Abstract: A method and computer-readable medium convert the text of a word and a user's pronunciation of the word into a phonetic description to be added to a speech recognition lexicon. Initially, a plurality of at least two possible phonetic descriptions are generated. One phonetic description is formed by decoding a speech signal representing a user's pronunciation of the word. At least one other phonetic description is generated from the text of the word. The plurality of possible sequences comprising speech-based and text-based phonetic descriptions are aligned and scored in a single graph based on their correspondence to the user's pronunciation. The phonetic description with the highest score is then selected for entry in the speech recognition lexicon.

9.

Granted patent
Speech recognition system for recognizing continuous and isolated speech
Legal status: Lapsed

Publication number: US6076056A

Publication date: 2000-06-13

Application number: US934622

Filing date: 1997-09-19

Applicants: Xuedong D. Huang, Fileno A. Alleva, Li Jiang, Mei-Yuh Hwang

Inventors: Xuedong D. Huang, Fileno A. Alleva, Li Jiang, Mei-Yuh Hwang

IPC classification: G10L15/02, G10L15/04, G10L15/06, G10L15/08, G10L15/14, G10L15/28

CPC classification: G10L15/08, G10L15/05

Abstract: Speech recognition is performed by receiving isolated speech training data indicative of a plurality of discretely spoken training words, and receiving continuous speech training data indicative of a plurality of continuously spoken training words. A plurality of speech unit models is trained based on the isolated speech training data and the continuous speech training data. Speech is recognized based on the speech unit models trained.

10.

Granted patent
Generating large units of graphonemes with mutual information criterion for letter to sound conversion
Legal status: Lapsed

Publication number: US07693715B2

Publication date: 2010-04-06

Application number: US10797358

Filing date: 2004-03-10

Applicants: Mei-Yuh Hwang, Li Jiang

Inventors: Mei-Yuh Hwang, Li Jiang

IPC classification: G10L15/04

CPC classification: G10L13/08

Abstract: A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.
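
The merge step in the abstract above (score adjacent graphoneme pairs by mutual information, then combine one pair into a new unit) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented procedure: a graphoneme is represented as a hypothetical `(letters, phones)` tuple, the score is taken to be p(a,b) * log(p(a,b) / (p(a) p(b))) over adjacent pairs, and the toy lexicon and phone labels are invented for the example.

```python
import math
from collections import Counter


def mutual_information_scores(words):
    """Score every adjacent pair of units found in the segmented words.

    Assumed score: p(a, b) * log(p(a, b) / (p(a) * p(b))), with
    probabilities estimated from raw counts.
    """
    unit_counts = Counter(u for w in words for u in w)
    pair_counts = Counter(p for w in words for p in zip(w, w[1:]))
    n_units = sum(unit_counts.values())
    n_pairs = sum(pair_counts.values())
    scores = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n_pairs
        p_a = unit_counts[a] / n_units
        p_b = unit_counts[b] / n_units
        scores[(a, b)] = p_ab * math.log(p_ab / (p_a * p_b))
    return scores


def combine(a, b):
    """Join two (letters, phones) units into one larger unit."""
    phones = " ".join(p for p in (a[1], b[1]) if p)
    return (a[0] + b[0], phones)


def merge_best_pair(words):
    """Merge the highest-scoring adjacent pair everywhere it occurs."""
    scores = mutual_information_scores(words)
    if not scores:
        return words, None
    best = max(scores, key=scores.get)
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and (w[i], w[i + 1]) == best:
                out.append(combine(w[i], w[i + 1]))
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged, best


if __name__ == "__main__":
    # Toy lexicon: each word is a list of single-letter graphonemes.
    lexicon = [
        [("t", "dh"), ("h", ""), ("e", "ah")],
        [("t", "dh"), ("h", ""), ("i", "ih"), ("s", "s")],
        [("c", "k"), ("a", "ae"), ("t", "t")],
    ]
    new_lexicon, pair = merge_best_pair(lexicon)
    print(pair)
    print(new_lexicon)
```

Calling `merge_best_pair` repeatedly on its own output would grow progressively larger units, which is one way to read the "large units" in the title; a stopping criterion is left out here because the abstract does not state one.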
