TEXT-BASED UNSUPERVISED LEARNING OF LANGUAGE MODELS
    1.
    Patent application
    TEXT-BASED UNSUPERVISED LEARNING OF LANGUAGE MODELS (status: pending, published)

    Publication number: US20150254233A1

    Publication date: 2015-09-10

    Application number: US14198600

    Filing date: 2014-03-06

    CPC classification number: G06F17/2715

    Abstract: A method for constructing a language model for a domain, comprising incorporating textual terms related to the domain in language models having relevance to the domain that are constructed from clusters of textual data collected from a variety of sources, thus generating an adapted language model adapted for the domain, wherein the textual data is collected from the variety of sources by a computerized apparatus connectable to the variety of sources and wherein the method is performed on at least one computerized apparatus configured to perform the method.
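
    The abstract above can be illustrated with a minimal sketch. It is not the patented method: it assumes unigram counts as a stand-in for a language model, treats a cluster as relevant when it mentions at least `min_overlap` domain terms, and handles single-word terms only. The names `unigram_counts`, `adapt_language_model`, `min_overlap`, and `boost` are hypothetical.

```python
from collections import Counter

def unigram_counts(texts):
    """Count lower-cased, whitespace-separated tokens over a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts

def adapt_language_model(clusters, domain_terms, min_overlap=1, boost=1):
    """Merge the counts of domain-relevant clusters, then add the domain terms.

    Assumption: relevance is a simple count of shared terms; the source does
    not say how relevance is determined.
    """
    domain = {term.lower() for term in domain_terms}
    adapted = Counter()
    for texts in clusters:
        counts = unigram_counts(texts)
        if len(domain.intersection(counts)) >= min_overlap:
            adapted.update(counts)
    # Incorporate the domain terms themselves so none has zero probability.
    for term in domain:
        adapted[term] += boost
    total = sum(adapted.values())
    return {word: n / total for word, n in adapted.items()}

clusters = [["my flight was delayed", "flight booking"], ["the soup was cold"]]
model = adapt_language_model(clusters, ["flight", "baggage"])
```

    Only the first cluster mentions a domain term, so the second one contributes nothing to `model`, while "baggage" receives probability mass even though no collected text contains it.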

    OUT OF VOCABULARY PATTERN LEARNING
    2.
    Patent application
    OUT OF VOCABULARY PATTERN LEARNING (status: granted)

    Publication number: US20160171973A1

    Publication date: 2016-06-16

    Application number: US14571347

    Filing date: 2014-12-16

    CPC classification number: G10L15/183 G10L15/146 G10L15/187

    Abstract: A method for adapting a speech recognition system for out-of-vocabulary words, comprising: decoding, by a hybrid speech recognition, speech that includes out-of-vocabulary terms, thereby generating graphemic transcriptions of the speech with a mixture of recognized in-vocabulary words and unrecognized sub-words, while keeping track of the decoded segments of the speech; determining, in the transcriptions, sequences of sub-words as candidate out-of-vocabulary words based on a first condition with respect to the lengths of the sequences of sub-words and a second condition with respect to the number of repetitions of the sequences; audibly presenting to a user the candidate out-of-vocabulary words from the corresponding segments of the speech according to the track; receiving from the user indications of valid words corresponding to the audible presentations of the sequences of sub-words in the candidate out-of-vocabulary words; and training a speech recognition to additionally recognize the candidate out-of-vocabulary words, thereby adapting the speech recognition to recognize out-of-vocabulary words, wherein the method is performed on at least one computerized apparatus configured to perform the method; and an apparatus for performing the same.
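
    The candidate-selection step of this abstract can be sketched as follows. The abstract does not state the two conditions, so this sketch assumes both are minimum thresholds (`min_subwords` for sequence length, `min_repeats` for repetitions). The token layout `(text, is_subword, start_sec, end_sec)` and the function name `find_oov_candidates` are likewise hypothetical.

```python
from collections import defaultdict

def find_oov_candidates(tokens, min_subwords=2, min_repeats=2):
    """Return {sub-word sequence: [(start_sec, end_sec), ...]} for candidate OOV words.

    tokens: (text, is_subword, start_sec, end_sec) tuples in decoding order.
    A candidate is a maximal run of consecutive sub-words that is long enough
    and occurs often enough (both thresholds are assumptions of this sketch).
    """
    runs = defaultdict(list)  # sub-word sequence -> time segments where it occurred
    current = []
    # A trailing in-vocabulary sentinel flushes a run that ends the transcription.
    for token in list(tokens) + [("", False, 0.0, 0.0)]:
        if token[1]:
            current.append(token)
            continue
        if current:
            key = tuple(t[0] for t in current)
            runs[key].append((current[0][2], current[-1][3]))
            current = []
    return {
        seq: segments
        for seq, segments in runs.items()
        if len(seq) >= min_subwords and len(segments) >= min_repeats
    }
```

    The stored time segments play the role of the "track" in the abstract: they identify which stretches of audio to play back when asking the user to confirm a candidate.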

    LANGUAGE MODEL ADAPTATION FOR SPECIFIC TEXTS
    3.
    Patent application
    LANGUAGE MODEL ADAPTATION FOR SPECIFIC TEXTS (status: granted)

    Publication number: US20150370784A1

    Publication date: 2015-12-24

    Application number: US14307520

    Filing date: 2014-06-18

    Abstract: A computerized method for adapting a baseline language model, comprising obtaining a textual corpus of documents that comprise textual expressions, incorporating in the baseline language model textual expressions from documents which are determined as relevant to a provided target text based on a plurality of different relevancy determinations between the documents and the provided target text, thereby adapting the baseline language model to form an adapted language model for recognizing terms of a context of the provided target text, wherein the method is automatically performed on an at least one computerized apparatus configured to perform the method.
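
    As a rough illustration of "a plurality of different relevancy determinations", the sketch below scores each document against the target text with two measures, Jaccard overlap and cosine similarity of term counts, and folds a document into the baseline counts if either score passes its threshold. The choice of measures, the "any measure passes" rule, the thresholds, and all function names are assumptions made here, not details from the source.

```python
import math
from collections import Counter

def tokens(text):
    return text.lower().split()

def jaccard(a, b):
    """Overlap of the two vocabularies, ignoring word frequency."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def cosine(a, b):
    """Cosine similarity of the two term-count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def adapt_baseline(baseline_counts, documents, target_text, thresholds=(0.2, 0.3)):
    """Add the tokens of every relevant document to a copy of the baseline counts."""
    target = tokens(target_text)
    adapted = Counter(baseline_counts)
    for doc in documents:
        doc_tokens = tokens(doc)
        scores = (jaccard(doc_tokens, target), cosine(doc_tokens, target))
        if any(score >= limit for score, limit in zip(scores, thresholds)):
            adapted.update(doc_tokens)
    return adapted
```

    For example, with the target text "ticket refund request", the document "refund the ticket" shares two of four distinct words (Jaccard 0.5) and is incorporated, while "rain is expected tomorrow" scores zero on both measures and is skipped.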

    SYSTEM AND METHOD FOR AUTOMATIC LANGUAGE MODEL SELECTION
    4.
    Patent application
    SYSTEM AND METHOD FOR AUTOMATIC LANGUAGE MODEL SELECTION (status: pending, published)

    Publication number: US20160365093A1

    Publication date: 2016-12-15

    Application number: US14736282

    Filing date: 2015-06-11

    Inventor: Maor NISSAN

    CPC classification number: G10L15/187 G06F17/2735 G10L15/183

    Abstract: A system and method for generating a transcript of an audio input. An embodiment of a system and method may include generating a phonetic lattice by decoding the audio input and producing a transcription based on the phonetic lattice and based on a first language model. A transcription may be analyzed to produce analysis results. Analysis results may be used to select one language model from a plurality of language models, and the selected language model may be used to generate a transcript of the audio input.
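
    Only the selection step lends itself to a short example; lattice decoding is out of scope here. The sketch below assumes each candidate language model is summarized by a vocabulary set and that the "analysis results" are plain word counts over the first-pass transcription. Both are simplifications chosen for illustration, and `select_language_model` and `model_vocabularies` are hypothetical names.

```python
from collections import Counter

def select_language_model(first_pass_transcription, model_vocabularies):
    """Return the name of the model whose vocabulary covers the most tokens.

    model_vocabularies: {model name: set of words}. On a tie, the model
    listed first in the dictionary wins.
    """
    # Stand-in for the "analysis results": token counts of the first pass.
    word_counts = Counter(first_pass_transcription.lower().split())

    def coverage(name):
        vocab = model_vocabularies[name]
        return sum(n for word, n in word_counts.items() if word in vocab)

    return max(model_vocabularies, key=coverage)

models = {
    "banking": {"account", "loan", "balance"},
    "travel": {"flight", "hotel", "booking"},
}
chosen = select_language_model("i want to check my account balance", models)
```

    In a full pipeline the chosen model would then drive a second decoding pass over the same audio to produce the final transcript.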

    SYSTEM AND METHOD FOR AUTOMATIC LANGUAGE MODEL GENERATION
    5.
    Patent application
    SYSTEM AND METHOD FOR AUTOMATIC LANGUAGE MODEL GENERATION (status: granted)

    Publication number: US20160365090A1

    Publication date: 2016-12-15

    Application number: US14736277

    Filing date: 2015-06-11

    Inventor: Maor NISSAN

    CPC classification number: G10L15/19 G10L15/187

    Abstract: A computer-implemented method of generating a language model. An embodiment of a system and method may include selecting a set of words from a transcription of an audio input, the transcription produced by a current language model. The set of words may be used to obtain a set of content objects. The set of content objects may be used to generate a new language model. The current language model may be replaced by the new language model.
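
    The loop in this abstract (select words, obtain content objects, generate a new model) can be sketched in a few lines. Everything concrete here is an assumption: words are chosen by frequency after removing a tiny illustrative stop-word list, content objects come from an in-memory `corpus` rather than an external source, and the "language model" is a table of bigram counts. All function names are hypothetical.

```python
from collections import Counter

STOPWORDS = {"the", "a", "to", "i", "my", "is", "of", "and"}  # illustrative only

def select_words(transcription, top_n=3):
    """Pick the most frequent non-stop-words from the transcription."""
    counts = Counter(w for w in transcription.lower().split() if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]

def fetch_content_objects(words, corpus):
    """Return the corpus documents that contain at least one selected word."""
    wanted = set(words)
    return [doc for doc in corpus if wanted & set(doc.lower().split())]

def build_language_model(content_objects):
    """Count bigrams over the retrieved documents as a toy language model."""
    bigrams = Counter()
    for doc in content_objects:
        toks = doc.lower().split()
        bigrams.update(zip(toks, toks[1:]))
    return bigrams

def regenerate_model(transcription, corpus):
    """One pass of the loop; the result would replace the current model."""
    words = select_words(transcription)
    return build_language_model(fetch_content_objects(words, corpus))
```

    Calling `regenerate_model` again on a transcription produced with the new model would repeat the cycle, which is one way to read the replacement step in the abstract.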
