Patent search: cpc:"G10L2015/0633" (page 8)

71.

Patent application
METHOD FOR BUILDING LANGUAGE MODEL, SPEECH RECOGNITION METHOD AND ELECTRONIC APPARATUS [In force]

Publication number: US20150112679A1

Publication date: 2015-04-23

Application number: US14499261

Filing date: 2014-09-29

Applicant: VIA Technologies, Inc.

Inventor: Guo-Feng Zhang

IPC classification: G10L15/06, G10L15/00

CPC classification: G10L15/187, G10L15/063, G10L15/14, G10L15/26, G10L2015/0633

Abstract: A method for building a language model, a speech recognition method and an electronic apparatus are provided. The speech recognition method includes the following steps. Phonetic transcriptions of a speech signal are obtained from an acoustic model. Phonetic spellings matching the phonetic transcriptions are obtained according to the phonetic transcriptions and a syllable acoustic lexicon. According to the phonetic spellings, a plurality of text sequences and a plurality of text sequence probabilities are obtained from a language model. Each phonetic spelling is matched to a candidate sentence table; a word probability of each phonetic spelling matching a word in a sentence of the sentence table are obtained; and the word probabilities of the phonetic spellings are calculated so as to obtain the text sequence probabilities. The text sequence corresponding to a largest one of the sequence probabilities is selected as a recognition result of the speech signal.

72.

Granted patent
System and method for generating personal vocabulary from network data [In force]

Publication number: US08990083B1

Publication date: 2015-03-24

Application number: US12571404

Filing date: 2009-09-30

Applicant: Satish K. Gannu, Ashutosh A. Malegaonkar, Virgil N. Mihailovici

Inventor: Satish K. Gannu, Ashutosh A. Malegaonkar, Virgil N. Mihailovici

IPC classification: G10L15/00

CPC classification: G10L15/06, G10L2015/0633

Abstract: A method is provided in one example and includes receiving data propagating in a network environment, and identifying selected words within the data based on a whitelist. The whitelist includes a plurality of designated words to be tagged. The method further includes assigning a weight to the selected words based on at least one characteristic associated with the data, and associating the selected words to an individual. A resultant composite is generated for the selected words that are tagged. In more specific embodiments, the resultant composite is partitioned amongst a plurality of individuals associated with the data propagating in the network environment. A social graph can be generated that identifies a relationship between a selected individual and the plurality of individuals based on a plurality of words exchanged between the selected individual and the plurality of individuals.

73.

Patent application
Unsupervised Clustering of Dialogs Extracted from Released Application Logs [In force]

Publication number: US20150051910A1

Publication date: 2015-02-19

Application number: US13969825

Filing date: 2013-08-19

Applicant: Nuance Communications, Inc.

Inventor: Jean-Francois Lavallée

IPC classification: G10L15/06

CPC classification: G06F17/279, G10L15/063, G10L2015/0631, G10L2015/0633

Abstract: A natural language understanding system performs automatic unsupervised clustering of dialog data from a natural language dialog application. A log parser automatically extracts structured dialog data from application logs. A dialog generalizing module generalizes the extracted dialog data to generalization identifier vectors. A data clustering module automatically clusters the dialog data based on the generalization identifier vectors using an unsupervised density-based clustering algorithm without a predefined number of clusters and without a predefined distance threshold in an iterative approach based on a hierarchical ordering of the generalization.

74.

Patent application
Discriminative Training of Document Transcription System [In force]

Publication number: US20140343939A1

Publication date: 2014-11-20

Application number: US14244053

Filing date: 2014-04-03

Applicant: MModal IP LLC

Inventor: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch

IPC classification: G10L15/06, G10L15/26

CPC classification: G10L15/063, G06F17/271, G06F17/2775, G06F17/28, G10L15/02, G10L15/183, G10L15/193, G10L15/26, G10L2015/0631, G10L2015/0633

Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.

75.

Patent application
NAME RECOGNITION SYSTEM [In force]

Publication number: US20130332164A1

Publication date: 2013-12-12

Application number: US13492720

Filing date: 2012-06-08

Applicant: Devang K. Nalk

Inventor: Devang K. Nalk

IPC classification: G10L15/06

CPC classification: G10L15/187, G10L15/30, G10L2015/025, G10L2015/0633

Abstract: A speech recognition system uses, in one embodiment, an extended phonetic dictionary that is obtained by processing words in a user's set of databases, such as a user's contacts database, with a set of pronunciation guessers. The speech recognition system can use a conventional phonetic dictionary and the extended phonetic dictionary to recognize speech inputs that are user requests to use the contacts database, for example, to make a phone call, etc. The extended phonetic dictionary can be updated in response to changes in the contacts database, and the set of pronunciation guessers can include pronunciation guessers for a plurality of locales, each locale having its own pronunciation guesser.

76.

Patent application
Discriminative Training of Document Transcription System [In force]

Publication number: US20130166297A1

Publication date: 2013-06-27

Application number: US13773928

Filing date: 2013-02-22

Applicant: MULTIMODAL TECHNOLOGIES, LLC

Inventor: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch

IPC classification: G10L15/06

CPC classification: G10L15/063, G06F17/271, G06F17/2775, G06F17/28, G10L15/02, G10L15/183, G10L15/193, G10L15/26, G10L2015/0631, G10L2015/0633

Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.

77.

Patent application
OBJECT CLASSIFICATION/RECOGNITION APPARATUS AND METHOD [In force]

Publication number: US20130163887A1

Publication date: 2013-06-27

Application number: US13724220

Filing date: 2012-12-21

Applicant: HONDA MOTOR CO., LTD., NATIONAL UNIVERSITY CORPORATION KOBE UNIVERSITY

Inventor: Mikio NAKANO, Naoto IWAHASHI, Yasuo ARIKI, Yuko OZASA, Takahiro HORI, Ryohei NAKATANI

IPC classification: G06K9/62

CPC classification: G06K9/6267, G06K9/6254, G06K9/6277, G06K9/6293, G10L15/01, G10L2015/025, G10L2015/0633

Abstract: An apparatus is provided for classifying targets into a known-object group and an unknown-object group. The apparatus includes a speech/image data storage unit configured to store a spoken sound of a name of an object and an image of the object; a unit configured to calculate a speech confidence level of a speech for the name of the object with reference to a spoken sound of a name of a known object; a unit configured to calculate an image confidence level of an image of an object with respect to an image of a known object; and a unit configured to compare an evaluation value, which is obtained by combining the speech confidence level and image confidence level, with a threshold value, and classify a target object into an object group determined according to whether the spoken sound of the name and the image are known or unknown.

78.

Granted patent
Discriminative training of document transcription system [In force]

Publication number: US08412521B2

Publication date: 2013-04-02

Application number: US11228607

Filing date: 2005-09-16

Applicant: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch

Inventor: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch

IPC classification: G10L15/26, G10L15/18

CPC classification: G10L15/063, G06F17/271, G06F17/2775, G06F17/28, G10L15/02, G10L15/183, G10L15/193, G10L15/26, G10L2015/0631, G10L2015/0633

Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.

79.

Patent application
SYSTEM AND METHOD FOR BUILDING DIVERSE LANGUAGE MODELS [In force]

Publication number: US20120232885A1

Publication date: 2012-09-13

Application number: US13042890

Filing date: 2011-03-08

Applicant: Luciano De Andrade BARBOSA, Srinivas BANGALORE

Inventor: Luciano De Andrade BARBOSA, Srinivas BANGALORE

IPC classification: G06F17/27, G10L15/00

CPC classification: G06F17/28, G06F17/21, G06F17/27, G06F17/2705, G06F17/2715, G06F17/2735, G06F17/2765, G10L2015/0633

Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for collecting web data in order to create diverse language models. A system configured to practice the method first crawls, such as via a crawler operating on a computing device, a set of documents in a network of interconnected devices according to a visitation policy, wherein the visitation policy is configured to focus on novelty regions for a current language model built from previous crawling cycles by crawling documents whose vocabulary considered likely to fill gaps in the current language model. A language model from a previous cycle can be used to guide the creation of a language model in the following cycle. The novelty regions can include documents with high perplexity values over the current language model.

80.

Patent application
Medical vocabulary templates in speech recognition [Under examination, published]

Publication number: US20060241943A1

Publication date: 2006-10-26

Application number: US11477121

Filing date: 2006-06-29

Applicant: Anuthep Benja-Athon, Sirikit Benja-Athon

Inventor: Anuthep Benja-Athon, Sirikit Benja-Athon

IPC classification: G10L15/26

CPC classification: G06Q10/10, G06Q50/22, G10L15/06, G10L15/285, G10L25/48, G10L2015/0633, G10L2015/223

Abstract: A system of templates of words and terms use in medicine and surgery by physicians for optimizing the outcomes of speech recognition process of converting digital voice data produced by physicians into digital text data comprises words and terms use in medical and surgical specialties. The medical and surgical vocabulary templates comprise individual logical arrangements or orders of related words and terms use by physicians to communicate and record health and health-care information and data.
