Techniques for Inferring the Unknown Intents of Linguistic Items
    2.
    Patent application
    Status: Granted

    Publication No.: US20150227845A1

    Publication date: 2015-08-13

    Application No.: US14180335

    Filing date: 2014-02-13

    IPC classification: G06N5/04 G06N5/02 G06F17/27

    Abstract: Functionality is described herein for determining the intents of linguistic items (such as queries), to produce intent output information. For some linguistic items, the functionality deterministically assigns intents to the linguistic items based on known intent labels, which, in turn, may be obtained or derived from a knowledge graph or other type of knowledge resource. For other linguistic items, the functionality infers the intents of the linguistic items based on selection log data (such as click log data provided by a search system). In some instances, the intent output information may reveal new intents that are not represented by the known intent labels. In one implementation, the functionality can use the intent output information to train a language understanding model.

    Language Modeling For Conversational Understanding Domains Using Semantic Web Resources
    3.
    Patent application
    Status: Granted

    Publication No.: US20150332670A1

    Publication date: 2015-11-19

    Application No.: US14278659

    Filing date: 2014-05-15

    IPC classification: G10L15/06 G10L15/18 G06F17/28

    Abstract: Systems and methods are provided for training language models using in-domain-like data collected automatically from one or more data sources. The data sources (such as text data or user-interactional data) are mined for specific types of data, including data related to style, content, and probability of relevance, which are then used for language model training. In one embodiment, a language model is trained from features extracted from a knowledge graph modified into a probabilistic graph, where entity popularities are represented and the popularity information is obtained from data sources related to the knowledge. Embodiments of language models trained from this data are particularly suitable for domain-specific conversational understanding tasks where natural language is used, such as user interaction with a game console or a personal assistant application on a personal device.

    Discriminating Between Natural Language and Keyword Language Items
    4.
    Patent application
    Status: Granted

    Publication No.: US20150161107A1

    Publication date: 2015-06-11

    Application No.: US14155097

    Filing date: 2014-01-14

    IPC classification: G06F17/28

    Abstract: This disclosure pertains to a classification model, and to functionality for producing and applying the classification model. The classification model is configured to discriminate whether an input linguistic item (such as a query) corresponds to a natural language (NL) linguistic item or a keyword language (KL) linguistic item. An NL linguistic item expresses an intent using a natural language, while a KL linguistic item expresses the intent using one or more keywords. In a training phase, the functionality produces the classification model based on query click log data or the like. In an application phase, the functionality may, among other uses, use the classification model to filter a subset of NL linguistic items from a larger set of items, and then use the subset of NL linguistic items to train a natural language interpretation model, such as a spoken language understanding model.

    Discriminating between natural language and keyword language items
    5.
    Granted patent
    Status: Granted

    Publication No.: US09558176B2

    Publication date: 2017-01-31

    Application No.: US14155097

    Filing date: 2014-01-14

    IPC classification: G06F17/27

    Abstract: This disclosure pertains to a classification model, and to functionality for producing and applying the classification model. The classification model is configured to discriminate whether an input linguistic item (such as a query) corresponds to a natural language (NL) linguistic item or a keyword language (KL) linguistic item. An NL linguistic item expresses an intent using a natural language, while a KL linguistic item expresses the intent using one or more keywords. In a training phase, the functionality produces the classification model based on query click log data or the like. In an application phase, the functionality may, among other uses, use the classification model to filter a subset of NL linguistic items from a larger set of items, and then use the subset of NL linguistic items to train a natural language interpretation model, such as a spoken language understanding model.

    Deep structured semantic model produced using click-through data
    6.
    Granted patent
    Status: Granted

    Publication No.: US09519859B2

    Publication date: 2016-12-13

    Application No.: US14019563

    Filing date: 2013-09-06

    Abstract: A deep structured semantic module (DSSM) is described herein which uses a model that is discriminatively trained based on click-through data, e.g., such that a conditional likelihood of clicked documents, given respective queries, is maximized, and a conditional likelihood of non-clicked documents, given the queries, is reduced. In operation, after training is complete, the DSSM maps an input item into an output item expressed in a semantic space, using the trained model. To facilitate training and runtime operation, a dimensionality-reduction module (DRM) can reduce the dimensionality of the input item that is fed to the DSSM. A search engine may use the above-summarized functionality to convert a query and a plurality of documents into the common semantic space, and then determine the similarity between the query and documents in the semantic space. The search engine may then rank the documents based, at least in part, on the similarity measures.
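
The training criterion in this abstract (raise the conditional likelihood of clicked documents, lower that of non-clicked ones) can be illustrated with a small sketch. The abstract does not state the exact objective, so the softmax over scaled cosine similarities used here is an assumption, as are the names `cosine`, `clicked_posteriors`, `click_loss`, and the smoothing factor `gamma`. The vectors stand in for items already mapped into the semantic space.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def clicked_posteriors(query_vec, doc_vecs, gamma=10.0):
    """Softmax over scaled similarities: an assumed form of P(document | query)."""
    scores = [gamma * cosine(query_vec, d) for d in doc_vecs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def click_loss(query_vec, doc_vecs, clicked_index=0, gamma=10.0):
    """Negative log-likelihood of the clicked document among the candidates.

    Minimizing this raises the clicked document's probability and, because the
    probabilities sum to one, lowers the probability of the non-clicked ones.
    """
    probs = clicked_posteriors(query_vec, doc_vecs, gamma)
    return -math.log(probs[clicked_index])

# Hypothetical 2-D semantic vectors: one clicked document and two non-clicked ones.
query = [1.0, 0.0]
docs = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
probs = clicked_posteriors(query, docs)
loss = click_loss(query, docs, clicked_index=0)
```

The same similarity scores could also be used at query time to rank documents, as the abstract describes for the search engine.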

    Knowledge Source Personalization To Improve Language Models
    7.
    Patent application
    Status: Granted

    Publication No.: US20150332672A1

    Publication date: 2015-11-19

    Application No.: US14280070

    Filing date: 2014-05-16

    IPC classification: G10L15/18

    Abstract: Systems and methods are provided for improving language models for speech recognition by personalizing knowledge sources utilized by the language models to specific users or user-population characteristics. A knowledge source, such as a knowledge graph, is personalized for a particular user by mapping entities or user actions from usage history for the user, such as query logs, to the knowledge source. The personalized knowledge source may be used to build a personal language model by training a language model with queries corresponding to entities or entity pairs that appear in usage history. In some embodiments, a personalized knowledge source for a specific user can be extended based on personalized knowledge sources of similar users.

    Techniques for inferring the unknown intents of linguistic items

    Publication No.: US09870356B2

    Publication date: 2018-01-16

    Application No.: US14180335

    Filing date: 2014-02-13

    IPC classification: G06N5/00 G06F17/27

    Abstract: Functionality is described herein for determining the intents of linguistic items (such as queries), to produce intent output information. For some linguistic items, the functionality deterministically assigns intents to the linguistic items based on known intent labels, which, in turn, may be obtained or derived from a knowledge graph or other type of knowledge resource. For other linguistic items, the functionality infers the intents of the linguistic items based on selection log data (such as click log data provided by a search system). In some instances, the intent output information may reveal new intents that are not represented by the known intent labels. In one implementation, the functionality can use the intent output information to train a language understanding model.

    Session Context Modeling For Conversational Understanding Systems
    10.
    Patent application
    Status: Under examination (published)

    Publication No.: US20150370787A1

    Publication date: 2015-12-24

    Application No.: US14308174

    Filing date: 2014-06-18

    IPC classification: G06F17/28

    Abstract: Systems and methods are provided for improving language models for speech recognition by adapting knowledge sources utilized by the language models to session contexts. A knowledge source, such as a knowledge graph, is used to capture and model dynamic session context based on user interaction information from usage history, such as session logs, that is mapped to the knowledge source. From sequences of user interactions, higher level intent sequences may be determined and used to form models that anticipate similar intents but with different arguments including arguments that do not necessarily appear in the usage history. In this way, the session context models may be used to determine likely next interactions or “turns” from a user, given a previous turn or turns. Language models corresponding to the likely next turns are then interpolated and provided to improve recognition accuracy of the next turn received from the user.
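
The last step of this abstract, interpolating the language models of the likely next turns, can be sketched as a weighted mixture. This is a minimal illustration only: it assumes unigram models stored as word-to-probability dictionaries and uses the predicted next-turn probabilities as mixture weights, neither of which is specified in the abstract. The function `interpolate` and the intent names are hypothetical.

```python
def interpolate(models, weights):
    """Linearly interpolate unigram language models.

    models:  intent name -> {word: probability}
    weights: intent name -> predicted probability of that intent being the next turn
    Weights are renormalized so the mixture is itself a probability distribution.
    """
    total = sum(weights.values())
    vocab = set()
    for name in weights:
        vocab.update(models[name])
    return {
        word: sum(weights[name] / total * models[name].get(word, 0.0) for name in weights)
        for word in vocab
    }

# Hypothetical per-intent models and next-turn predictions from a session context model.
intent_models = {
    "play_movie": {"play": 0.5, "movie": 0.5},
    "find_cast": {"who": 0.5, "movie": 0.5},
}
next_turn_weights = {"play_movie": 0.75, "find_cast": 0.25}
mixed = interpolate(intent_models, next_turn_weights)
```

A recognizer would then score the next utterance against `mixed` rather than against a single context-independent model.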
