Word generation for speech recognition

    Publication No.: US10134388B1

    Publication date: 2018-11-20

    Application No.: US14757680

    Filing date: 2015-12-23

    Abstract: An automatic speech recognition (ASR) system may add new words to an ASR system by identifying words with similar usage and replicating the variations of the identified words to create new words. A new word that is used similarly to a known word may be varied to create new word forms that are similar to the word forms of a known word. The new word forms may then be incorporated into an ASR model to allow the ASR system to recognize those words when they are detected in speech. Such a system may allow flexible incorporation and recognition of varied forms of new words entering a general lexicon.
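
    The idea in this abstract can be illustrated with a small sketch. It is not the patented method: it assumes usage similarity can be approximated by cosine similarity over context-word counts, and that word forms differ from their base only by a suffix. The function names (`cosine`, `most_similar_known_word`, `replicate_forms`) and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two context-count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_known_word(new_ctx: Counter, known_ctx: Dict[str, Counter]) -> str:
    """Pick the known word whose contexts look most like the new word's."""
    return max(known_ctx, key=lambda w: cosine(new_ctx, known_ctx[w]))


def replicate_forms(new_word: str, known_word: str, known_forms: List[str]) -> List[str]:
    """Copy the suffix variations of a known word onto a new word.

    Assumption: each form is the base word plus a suffix; other forms are skipped.
    """
    new_forms = []
    for form in known_forms:
        if form.startswith(known_word):
            new_forms.append(new_word + form[len(known_word):])
    return new_forms


# Toy data (hypothetical): context counts and known word forms.
known_ctx = {
    "photo": Counter({"take": 5, "post": 1, "a": 6}),
    "run": Counter({"fast": 3, "go": 2}),
}
known_forms = {
    "photo": ["photo", "photos", "photo's"],
    "run": ["run", "runs", "running"],
}
new_ctx = Counter({"take": 3, "post": 2, "a": 4})  # contexts seen for "selfie"

match = most_similar_known_word(new_ctx, known_ctx)
new_forms = replicate_forms("selfie", match, known_forms[match])
print(match, new_forms)
```

    In a real system the generated forms would still need pronunciations and language-model probabilities before they could be added to an ASR model; the sketch stops at generating the spellings.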

    Active learning for lexical annotations
    2.
    Granted patent
    Status: in force

    Publication No.: US09508341B1

    Publication date: 2016-11-29

    Application No.: US14476075

    Filing date: 2014-09-03

    IPC classification: G10L15/18 G10L13/00

    Abstract: Features are disclosed for active learning to identify the words which are likely to improve the guessing and automatic speech recognition (ASR) after manual annotation. When a speech recognition system needs pronunciations for words, a lexicon is typically used. For unknown words, pronunciation-guessing (G2P) may be included to provide pronunciations in an unattended (e.g., automatic) fashion. However, having manually (e.g., by a human) annotated pronunciations provides better ASR than having automatic pronunciations that may, in some instances, be wrong. The included active learning features help to direct these limited annotation resources.
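
    One way to picture the selection step is the sketch below. The abstract does not state a scoring rule, so the rule here is an assumption: each word is ranked by corpus frequency times the G2P model's uncertainty, and the top words within an annotation budget are sent to a human. `select_for_annotation` and the sample numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def select_for_annotation(
    candidates: Dict[str, Tuple[int, float]], budget: int
) -> List[str]:
    """Return up to `budget` words that look most worth annotating by hand.

    `candidates` maps word -> (corpus frequency, G2P confidence in [0, 1]).
    Assumed heuristic: expected gain = frequency * (1 - confidence).
    """
    def expected_gain(word: str) -> float:
        freq, conf = candidates[word]
        return freq * (1.0 - conf)

    # Sort by descending gain; break ties alphabetically for determinism.
    ranked = sorted(candidates, key=lambda w: (-expected_gain(w), w))
    return ranked[:budget]


# Hypothetical candidates: (frequency, G2P confidence).
words = {
    "quinoa": (120, 0.35),
    "table": (900, 0.99),
    "gnocchi": (80, 0.20),
    "xylem": (5, 0.10),
}
print(select_for_annotation(words, budget=2))
```

    A frequent word with a confident automatic pronunciation ("table") ranks below a rarer word the G2P model is unsure about, which is the behaviour the abstract motivates: spend limited human effort where it is likely to help recognition most.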

    Selective speech recognition scoring using articulatory features
    3.
    Granted patent
    Status: in force

    Publication No.: US09355636B1

    Publication date: 2016-05-31

    Application No.: US14027828

    Filing date: 2013-09-16

    IPC classification: G10L15/187 G10L15/14

    Abstract: Features are provided for selectively scoring portions of user utterances based at least on articulatory features of the portions. One or more articulatory features of a portion of a user utterance can be determined. Acoustic models or subsets of individual acoustic model components (e.g., Gaussians or Gaussian mixture models) can be selected based on the articulatory features of the portion. The portion can then be scored using a selected acoustic model or subset of acoustic model components. The process may be repeated for the multiple portions of the utterance, and speech recognition results can be generated from the scored portions.
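
    A toy sketch of selective scoring follows. It assumes a single articulatory feature (voiced versus unvoiced, decided by thresholding a hypothetical per-frame voicing energy) and one-dimensional Gaussian components; neither detail comes from the patent. The point it shows is that each frame is scored only against the component subset chosen for its articulatory class.

```python
import math
from typing import Dict, List, Tuple

# (weight, mean, variance); one-dimensional for brevity.
Gaussian = Tuple[float, float, float]


def log_gmm_score(x: float, components: List[Gaussian]) -> float:
    """Log-likelihood of x under a weighted sum of 1-D Gaussians."""
    total = 0.0
    for weight, mean, var in components:
        density = math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
        total += weight * density
    return math.log(total) if total > 0 else float("-inf")


def articulatory_class(voicing_energy: float, threshold: float = 0.5) -> str:
    """Assumed classifier: threshold a voicing-energy value."""
    return "voiced" if voicing_energy >= threshold else "unvoiced"


def score_frame(
    x: float, voicing_energy: float, subsets: Dict[str, List[Gaussian]]
) -> Tuple[str, float]:
    """Score a frame against only the subset selected for its class."""
    cls = articulatory_class(voicing_energy)
    return cls, log_gmm_score(x, subsets[cls])


# Hypothetical component subsets keyed by articulatory class.
subsets = {
    "voiced": [(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)],
    "unvoiced": [(1.0, 5.0, 1.0)],
}
frames = [(0.3, 0.9), (5.0, 0.1)]  # (acoustic value, voicing energy)
for x, energy in frames:
    print(score_frame(x, energy, subsets))
```

    Skipping the components outside the selected subset is where the saving would come from: the second frame is scored against one Gaussian instead of three.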

    Predicting pronunciation in speech recognition

    Publication No.: US10339920B2

    Publication date: 2019-07-02

    Application No.: US14196055

    Filing date: 2014-03-04

    Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names, etc.) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a hybrid pronunciation including a pronunciation in one language, a pronunciation in a second language and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.

    PREDICTING PRONUNCIATION IN SPEECH RECOGNITION
    5.
    Patent application
    Status: under examination (published)

    Publication No.: US20150255069A1

    Publication date: 2015-09-10

    Application No.: US14196055

    Filing date: 2014-03-04

    IPC classification: G10L17/22 G10L15/08

    Abstract: An automatic speech recognition (ASR) device may be configured to predict pronunciations of textual identifiers (for example, song names, etc.) based on predicting one or more languages of origin of the textual identifier. The one or more languages of origin may be determined based on the textual identifier. The pronunciations may include a hybrid pronunciation including a pronunciation in one language, a pronunciation in a second language and a hybrid pronunciation that combines multiple languages. The pronunciations may be added to a lexicon and matched to the content item (e.g., song) and/or textual identifier. The ASR device may receive a spoken utterance from a user requesting the ASR device to access the content item. The ASR device determines whether the spoken utterance matches one of the pronunciations of the content item in the lexicon. The ASR device then accesses the content when the spoken utterance matches one of the potential textual identifier pronunciations.

    Error reduction in speech processing

    Publication No.: US09697827B1

    Publication date: 2017-07-04

    Application No.: US13711478

    Filing date: 2012-12-11

    CPC classification: G10L15/18 G10L15/14 G10L15/19

    Abstract: Features are disclosed for reducing errors in speech recognition processing. Methods for reducing errors can include receiving multiple speech recognition hypotheses based on an utterance indicative of a command or query of a user and determining a command or query within a grammar having a least amount of difference from one of the speech recognition hypotheses. The determination of the least amount of difference may be based at least in part on a comparison of individual subword units along at least some of the sequence paths of the speech recognition hypotheses and the grammar. For example, the comparison may be performed on the phoneme level instead of the word level.
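
    The phoneme-level comparison can be sketched as follows. The sketch assumes that "least amount of difference" is measured with plain Levenshtein distance over phoneme sequences and that the grammar is a flat list of commands; the patent describes comparing sequence paths and does not prescribe this particular metric. The phoneme strings and the `closest_command` helper are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def closest_command(
    hypotheses: List[List[str]], grammar: Dict[str, List[str]]
) -> Optional[Tuple[str, int]]:
    """Return the grammar command nearest to any hypothesis, with its distance."""
    best: Optional[Tuple[str, int]] = None
    for command, phones in grammar.items():
        for hyp in hypotheses:
            dist = edit_distance(hyp, phones)
            if best is None or dist < best[1]:
                best = (command, dist)
    return best


# Hypothetical grammar and recognizer output, as phoneme sequences.
grammar = {
    "play music": ["P", "L", "EY", "M", "Y", "UW", "Z", "IH", "K"],
    "pause music": ["P", "AO", "Z", "M", "Y", "UW", "Z", "IH", "K"],
}
hypotheses = [
    ["P", "L", "EY", "M", "Y", "UW", "S", "IH", "K"],
    ["P", "L", "AY", "M", "Y", "UW", "Z", "IH", "K"],
]
print(closest_command(hypotheses, grammar))
```

    Both hypotheses would be wrong at the word level, yet each differs from "play music" by a single phoneme, so the phoneme-level comparison still recovers the intended command.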