System and method for collaborative language translation
    2.
    Granted patent
    System and method for collaborative language translation (status: in force)

    Publication No.: US09323746B2

    Publication date: 2016-04-26

    Application No.: US13311836

    Filing date: 2011-12-06

    IPC classification: G06F17/28

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for presenting a machine translation and alternative translations to a user, where a selection of any particular alternative translation results in the re-ranking of the remaining alternatives. The system then presents these re-ranked alternatives to the user, who can continue proofing the machine translation using the re-ranked alternatives or by typing an improved translation. This process continues until the user indicates that the current portion of the translation is complete, at which point the system moves to the next portion.

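The select-and-re-rank step in the abstract above can be sketched as follows. The abstract does not say how the remaining alternatives are scored, so the word-overlap heuristic and the names `overlap` and `rerank_alternatives` are assumptions made purely for illustration.

```python
def overlap(a: str, b: str) -> int:
    """Count distinct words shared by two candidate translations."""
    return len(set(a.lower().split()) & set(b.lower().split()))


def rerank_alternatives(selected: str, alternatives: list[str]) -> list[str]:
    """Return the remaining alternatives, most similar to the selection first.

    Assumption: similarity to the user's selection stands in for whatever
    scoring the actual system uses.
    """
    remaining = [alt for alt in alternatives if alt != selected]
    # sorted() is stable, so ties keep their original order.
    return sorted(remaining, key=lambda alt: overlap(selected, alt), reverse=True)


alternatives = [
    "the cat sits on the mat",
    "a cat is sitting on the rug",
    "the cat sat on the mat",
    "the feline rests on a carpet",
]
# The user picks one alternative; the rest are re-ranked around that choice.
reranked = rerank_alternatives("the cat sat on the mat", alternatives)
```

In an interactive loop, `reranked` would be shown to the user again, and the cycle would repeat until the user marks the current portion as complete.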

    System and method for feature-rich continuous space language models
    3.
    Granted patent
    System and method for feature-rich continuous space language models (status: in force)

    Publication No.: US09092425B2

    Publication date: 2015-07-28

    Application No.: US12963161

    Filing date: 2010-12-08

    IPC classification: G06F17/27 G06F17/28

    CPC classification: G06F17/28

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for predicting probabilities of words for a language model. An exemplary system configured to practice the method receives a sequence of words and external data associated with the sequence of words and maps the sequence of words to an X-dimensional vector, corresponding to a vocabulary size. Then the system processes each X-dimensional vector, based on the external data, to generate respective Y-dimensional vectors, wherein each Y-dimensional vector represents a dense continuous space, and outputs at least one next word predicted to follow the sequence of words based on the respective Y-dimensional vectors. The X-dimensional vector, which is a binary sparse representation, can be higher dimensional than the Y-dimensional vector, which is a dense continuous space. The external data can include part-of-speech tags, topic information, word similarity, word relationships, a particular topic, and succeeding parts of speech in a given history.

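The sparse-to-dense mapping in the abstract above can be illustrated with a toy example. Everything concrete here is an assumption: the vocabulary, the tag set, the random untrained weights, the choice to add a tag vector to the word vector, and the averaging of the history. It is a sketch of the general idea, not the patented model.

```python
import math
import random

VOCAB = ["the", "cat", "sat", "mat"]   # X = vocabulary size
TAGS = ["DET", "NOUN", "VERB"]         # external data: part-of-speech tags
X, Y = len(VOCAB), 2                   # Y < X: the dense space is lower dimensional

rng = random.Random(0)
# Hypothetical, untrained parameters; a real model would learn these.
word_proj = [[rng.uniform(-1, 1) for _ in range(Y)] for _ in range(X)]
tag_proj = [[rng.uniform(-1, 1) for _ in range(Y)] for _ in range(len(TAGS))]
out_proj = [[rng.uniform(-1, 1) for _ in range(X)] for _ in range(Y)]


def one_hot(index, size):
    """Binary sparse representation with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


def matvec(vec, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, matrix))
            for j in range(len(matrix[0]))]


def dense(word, tag):
    """Map a one-hot word vector to a Y-dimensional vector, shifted by its tag."""
    w = matvec(one_hot(VOCAB.index(word), X), word_proj)
    t = matvec(one_hot(TAGS.index(tag), len(TAGS)), tag_proj)
    return [a + b for a, b in zip(w, t)]


def next_word_probs(history):
    """Average the dense history vectors, then apply a softmax output layer."""
    vecs = [dense(w, t) for w, t in history]
    hidden = [sum(col) / len(vecs) for col in zip(*vecs)]
    scores = matvec(hidden, out_proj)
    z = sum(math.exp(s) for s in scores)
    return {w: math.exp(s) / z for w, s in zip(VOCAB, scores)}


probs = next_word_probs([("the", "DET"), ("cat", "NOUN")])
predicted = max(probs, key=probs.get)
```

Because the weights are random, `predicted` carries no linguistic meaning; the point is only the shape of the computation, from a one-hot X-dimensional input through a Y-dimensional dense vector to a distribution over the vocabulary.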

    System and method for building diverse language models
    4.
    Granted patent
    System and method for building diverse language models (status: in force)

    Publication No.: US09081760B2

    Publication date: 2015-07-14

    Application No.: US13042890

    Filing date: 2011-03-08

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for collecting web data in order to create diverse language models. A system configured to practice the method first crawls a set of documents in a network of interconnected devices, such as via a crawler operating on a computing device, according to a visitation policy. The visitation policy is configured to focus on novelty regions for a current language model built from previous crawling cycles, by crawling documents whose vocabulary is considered likely to fill gaps in the current language model. A language model from a previous cycle can be used to guide the creation of a language model in the following cycle. The novelty regions can include documents with high perplexity values over the current language model.

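The idea of steering a crawl toward high-perplexity documents can be sketched as below. The unigram model with add-one smoothing and the helper names `train_unigram`, `perplexity`, and `pick_novel` are assumptions for illustration; the abstract does not specify the model or the exact visitation policy.

```python
import math
from collections import Counter


def train_unigram(docs):
    """Count word occurrences in the documents gathered in earlier cycles."""
    return Counter(word for doc in docs for word in doc.lower().split())


def perplexity(doc, counts):
    """Unigram perplexity of a document, using add-one smoothing."""
    words = doc.lower().split()
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot for unseen words
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab)) for w in words)
    return math.exp(-log_prob / len(words))


def pick_novel(candidates, counts, k):
    """Policy sketch: prefer the k candidates the current model finds most surprising."""
    return sorted(candidates, key=lambda d: perplexity(d, counts), reverse=True)[:k]


crawled = ["the cat sat on the mat", "the dog sat on the rug"]
model = train_unigram(crawled)
candidates = ["the cat sat on the rug", "quantum flux capacitor diagnostics"]
novel = pick_novel(candidates, model, k=1)
```

The documents returned by `pick_novel` would be added to the corpus, and the model retrained for the next cycle, so that each cycle's model guides the next crawl.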

    System and method of providing machine translation from a source language to a target language
    5.
    Granted patent
    System and method of providing machine translation from a source language to a target language (status: in force)

    Publication No.: US08849665B2

    Publication date: 2014-09-30

    Application No.: US12022819

    Filing date: 2008-01-30

    IPC classification: G10L15/00 G10L15/18 G06F17/28

    CPC classification: G06F17/2827

    Abstract: A machine translation method, a system for using the method, and computer-readable media are disclosed. The method includes the steps of receiving a source language sentence and selecting a set of target language n-grams using a lexical classifier and based on the source language sentence. In the selected set of target language n-grams, n is greater than 1 for at least one n-gram. The method continues by combining the selected set of target language n-grams as a finite state acceptor (FSA), weighting the FSA with data from the lexical classifier, and generating an n-best list of target sentences from the FSA. As an alternative to using the FSA, N strings may be generated from the n-grams and ranked using a language model. The N strings may be represented by an FSA for efficiency, but this is not necessary.

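The non-FSA alternative mentioned at the end of the abstract, generating strings from the selected n-grams and ranking them with a language model, can be sketched by brute force. The n-gram list, the bigram scores, and the `FLOOR` value are invented for illustration, and exhaustive permutation is a stand-in that only works for very short sentences.

```python
from itertools import permutations

# Hypothetical output of a lexical classifier: selected target-language n-grams.
# At least one n-gram has n > 1, as the abstract requires.
selected_ngrams = [("the", "house"), ("is",), ("red",)]

# Hypothetical bigram language model scores; unseen bigrams get a small floor.
bigram_score = {
    ("<s>", "the"): 0.5, ("the", "house"): 0.4, ("house", "is"): 0.3,
    ("is", "red"): 0.3, ("red", "</s>"): 0.4,
}
FLOOR = 0.01


def lm_score(words):
    """Product of bigram scores over the sentence padded with boundary markers."""
    padded = ["<s>", *words, "</s>"]
    score = 1.0
    for prev, cur in zip(padded, padded[1:]):
        score *= bigram_score.get((prev, cur), FLOOR)
    return score


def n_best(ngrams, n):
    """Enumerate orderings of the n-grams and keep the n highest-scoring strings."""
    candidates = []
    for order in permutations(ngrams):
        words = [w for gram in order for w in gram]
        candidates.append((lm_score(words), " ".join(words)))
    candidates.sort(reverse=True)
    return [text for _, text in candidates[:n]]


best = n_best(selected_ngrams, n=2)
```

A weighted FSA would represent the same candidate set compactly and allow the n-best paths to be extracted without enumerating every ordering, which is the efficiency point the abstract makes.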

    System and method for combining speech recognition outputs from a plurality of domain-specific speech recognizers via machine learning
    6.
    Granted patent
    System and method for combining speech recognition outputs from a plurality of domain-specific speech recognizers via machine learning (status: in force)

    Publication No.: US08812321B2

    Publication date: 2014-08-19

    Application No.: US12895359

    Filing date: 2010-09-30

    IPC classification: G10L15/08 G10L15/32

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination.

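The select-then-combine flow in the abstract above can be sketched with a simple confidence-weighted vote. The recognizer outputs, the 0.5 threshold, and the per-position voting scheme are all assumptions; the title indicates the actual combination is learned, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical outputs from three domain-specific recognizers: (text, confidence).
outputs = [
    ("book a flight to boston", 0.82),   # travel-domain recognizer
    ("book a fright to boston", 0.55),   # general-domain recognizer
    ("cook a flight two austin", 0.20),  # cooking-domain recognizer
]


def select(outputs, threshold=0.5):
    """Keep only recognition outputs whose confidence meets the threshold."""
    return [(text, conf) for text, conf in outputs if conf >= threshold]


def combine(candidates):
    """Confidence-weighted vote per word position (assumes equal-length outputs)."""
    words = [text.split() for text, _ in candidates]
    combined = []
    for position in range(len(words[0])):
        votes = defaultdict(float)
        for tokens, (_, conf) in zip(words, candidates):
            votes[tokens[position]] += conf
        combined.append(max(votes, key=votes.get))
    return " ".join(combined)


text = combine(select(outputs))
```

Real hypotheses rarely align word for word, so a practical system would first align the candidates before voting; the equal-length assumption keeps the sketch short.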

    System and method of generating responses to text-based messages
    8.
    Granted patent
    System and method of generating responses to text-based messages (status: in force)

    Publication No.: US08296140B2

    Publication date: 2012-10-23

    Application No.: US13300752

    Filing date: 2011-11-21

    IPC classification: G10L15/00

    CPC classification: G06F17/2785

    Abstract: In accordance with one aspect of the present invention, an automated method of and system for generating a response to a text-based natural language message is disclosed. The method includes identifying a first selected input clause in a sentence in the text-based natural language message. The method also includes assigning a semantic tag to the first selected input clause and matching the semantic tag to a historical input tag, the historical input tag being associated with a first previously generated response clause. The method further includes generating an output response message based on the historical response clause, the output response message being derived from the historical input tag and a second previously generated response clause. The system includes means for performing the method steps.

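The clause-tag-match-respond pipeline in the abstract above can be sketched with a keyword tagger. The tag names, keyword lists, stored response clauses, and the naive split on " and " are hypothetical; the abstract does not describe how clauses are identified or how tags are assigned.

```python
# Hypothetical keyword lists used to assign a semantic tag to a clause.
TAG_KEYWORDS = {"billing": ["bill", "charge"], "outage": ["down", "outage"]}

# Hypothetical history: each input tag maps to a previously generated response clause.
HISTORICAL_RESPONSES = {
    "billing": "We have reviewed the charge on your account.",
    "outage": "We are sorry your service was interrupted.",
}


def tag_clause(clause):
    """Assign the first semantic tag whose keywords appear in the clause."""
    words = clause.lower().split()
    for tag, keywords in TAG_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return tag
    return None


def respond(message):
    """Split the sentence into clauses, tag each, and reuse matching past responses."""
    clauses = [c.strip() for c in message.rstrip(".!?").split(" and ")]
    replies = []
    for clause in clauses:
        tag = tag_clause(clause)
        if tag in HISTORICAL_RESPONSES:
            replies.append(HISTORICAL_RESPONSES[tag])
    return " ".join(replies)


reply = respond("My internet is down and I see an extra charge.")
```

Here the output message is assembled from two previously generated response clauses, one per input clause, which mirrors the first and second response clauses the abstract refers to.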

    SYSTEM AND METHOD OF SPOKEN LANGUAGE UNDERSTANDING IN HUMAN COMPUTER DIALOGS
    9.
    Patent application
    SYSTEM AND METHOD OF SPOKEN LANGUAGE UNDERSTANDING IN HUMAN COMPUTER DIALOGS (status: in force)

    Publication No.: US20120239383A1

    Publication date: 2012-09-20

    Application No.: US13481031

    Filing date: 2012-05-25

    IPC classification: G06F17/27 G10L15/00

    Abstract: A system and method are disclosed that improve automatic speech recognition in a spoken dialog system. The method comprises partitioning speech recognizer output into self-contained clauses, identifying a dialog act in each of the self-contained clauses, qualifying dialog acts by identifying a current domain object and/or a current domain action, and determining whether further qualification is possible for the current domain object and/or current domain action. If further qualification is possible, then the method comprises identifying another domain action and/or another domain object associated with the current domain object and/or current domain action, reassigning the other domain action and/or other domain object as the current domain action and/or current domain object, and then recursively qualifying the new current domain action and/or current domain object. This process continues until nothing is left to qualify.

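The recursive qualification step in the abstract above can be sketched over a small domain model. The `ASSOCIATED` table and the rule that an associated object qualifies the current one when it is mentioned in the clause are assumptions; clause partitioning and dialog act identification are left out.

```python
# Hypothetical domain model: each domain object lists the objects that can qualify it.
ASSOCIATED = {
    "order": ["shirt", "shoes"],
    "shirt": ["size", "color"],
    "shoes": ["size"],
    "size": [],
    "color": [],
}


def qualify(current, words, chain=None):
    """Recursively follow associated objects that are mentioned in the clause."""
    chain = (chain or []) + [current]
    for candidate in ASSOCIATED.get(current, []):
        if candidate in words:
            # The associated object becomes the new current object.
            return qualify(candidate, words, chain)
    return chain  # nothing is left to qualify


clause = "i want to order a shirt in a large size"
chain = qualify("order", set(clause.split()))
```

For this clause the chain runs from the order to the shirt to its size, and the recursion stops because `size` has no further associated objects.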

    SYSTEM AND METHOD FOR REFERRING TO ENTITIES IN A DISCOURSE DOMAIN
    10.
    Patent application
    SYSTEM AND METHOD FOR REFERRING TO ENTITIES IN A DISCOURSE DOMAIN (status: in force)

    Publication No.: US20120221332A1

    Publication date: 2012-08-30

    Application No.: US13465685

    Filing date: 2012-05-07

    IPC classification: G10L15/26

    Abstract: Systems, methods, and non-transitory computer-readable media for referring to entities are disclosed. The method includes receiving domain-specific training data of sentences describing a target entity in a context, extracting a speaker history and a visual context from the training data, selecting attributes of the target entity based on at least one of the speaker history, the visual context, and speaker preferences, generating a text expression referring to the target entity based on at least one of the selected attributes, the speaker history, and the context, and outputting the generated text expression. A weighted finite-state automaton can represent partial orderings of word pairs in the domain-specific training data. The weighted finite-state automaton can be speaker specific or speaker independent. The weighted finite-state automaton can include a set of weighted partial orderings of the training data for each possible realization.

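Attribute selection driven by speaker history and visual context can be sketched with a greedy procedure. This is an assumed stand-in: the entities, the usage counts in `speaker_history`, and the fixed word order in `realize` are hypothetical, and the weighted finite-state automaton the abstract uses for ordering words is not modeled.

```python
def select_attributes(target, distractors, speaker_history):
    """Greedily pick attributes that rule out distractors, favoring ones the speaker uses often."""
    order = sorted(target, key=lambda attr: -speaker_history.get(attr, 0))
    chosen, remaining = [], list(distractors)
    for attr in order:
        if not remaining:
            break  # the target is already uniquely identified
        if any(d.get(attr) != target[attr] for d in remaining):
            chosen.append(attr)
            remaining = [d for d in remaining if d.get(attr) == target[attr]]
    return chosen


def realize(target, chosen, word_order=("size", "color", "type")):
    """Turn the chosen attributes into a text expression using a fixed word order."""
    return "the " + " ".join(target[a] for a in word_order if a in chosen)


# Visual context: the target entity and the other entities in view.
target = {"type": "chair", "color": "red", "size": "large"}
distractors = [
    {"type": "chair", "color": "blue", "size": "large"},
    {"type": "table", "color": "red", "size": "small"},
]
# Speaker history: how often this speaker used each attribute before.
speaker_history = {"color": 5, "type": 3, "size": 1}

chosen = select_attributes(target, distractors, speaker_history)
expression = realize(target, chosen)
```

With this history the speaker's preferred attribute, color, is tried first, and size is never needed because color and type together already single out the target.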