Generating and executing query language statements from natural language

    Publication No.: US10180989B2

    Publication date: 2019-01-15

    Application No.: US14808138

    Filing date: 2015-07-24

    IPC classification: G06F17/30

    Abstract: Techniques for generating query language statements for a document repository are described herein. An example method includes detecting a search query corresponding to a document repository and generating a modified search query by adding atomic tags to the search query, the atomic tags being based on prior knowledge obtained by static analysis of the document repository and semantic rules. The method also includes generating enriched tags based on combinations of the atomic tags and any previously identified enriched tags and generating a first set of conditions based on combinations of the atomic tags and the generated enriched tags and generating a second set of conditions based on free-text conditions. The method also includes generating the query language statements based on the first set of conditions and the second set of conditions and displaying a plurality of documents from the document repository that satisfy the query language statements.
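The pipeline this abstract describes (atomic tags, enriched tags, two sets of conditions, a final statement) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the patented method: the `ATOMIC_LEXICON` table stands in for prior knowledge from static analysis, the single FIELD + PERSON rule stands in for the semantic rules, and the SQL-like output with a `CONTAINS` predicate is a hypothetical target language. Values are not escaped, so this is not safe for real input.

```python
from dataclasses import dataclass

# Hypothetical prior knowledge: token -> (tag kind, repository field or value).
ATOMIC_LEXICON = {
    "invoices": ("TYPE", "invoice"),
    "author": ("FIELD", "author"),
    "alice": ("PERSON", "alice"),
}


@dataclass(frozen=True)
class Tag:
    kind: str
    value: str


def add_atomic_tags(query):
    """Tag every token found in the lexicon; collect the rest as free text."""
    tagged, free_text = [], []
    for token in query.lower().split():
        if token in ATOMIC_LEXICON:
            tagged.append(Tag(*ATOMIC_LEXICON[token]))
        else:
            free_text.append(token)
    return tagged, free_text


def enrich(tags):
    """Combine each adjacent FIELD + PERSON pair into one enriched tag."""
    enriched = []
    for left, right in zip(tags, tags[1:]):
        if left.kind == "FIELD" and right.kind == "PERSON":
            enriched.append(Tag("FIELD_EQUALS", f"{left.value}={right.value}"))
    return enriched


def build_statement(query):
    """Build a SQL-like statement from tag conditions and free-text conditions."""
    atomic, free_text = add_atomic_tags(query)
    # First set of conditions: derived from atomic and enriched tags.
    first = [f"type = '{t.value}'" for t in atomic if t.kind == "TYPE"]
    for tag in enrich(atomic):
        field, value = tag.value.split("=")
        first.append(f"{field} = '{value}'")
    # Second set of conditions: one free-text match per untagged token.
    second = [f"CONTAINS(text, '{word}')" for word in free_text]
    where = " AND ".join(first + second)
    if not where:
        return "SELECT * FROM documents"
    return f"SELECT * FROM documents WHERE {where}"
```

For example, `build_statement("invoices author alice overdue")` produces one type condition, one author condition, and one free-text condition for the untagged word `overdue`.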

    Auto-maintained document classification
    Type: Granted invention patent
    Status: In force

    Publication No.: US09195947B2

    Publication date: 2015-11-24

    Application No.: US14492914

    Filing date: 2014-09-22

    IPC classification: G06N5/04, G06K9/62, G06N99/00

    Abstract: Machines, systems and methods for maintaining a representative data set in a document classification system, the method comprising: including an initial set of seed representative data in a representative data set (RDS) implemented for a knowledge base (KB), wherein the KB is trained to classify documents provided to a document classification system based on analysis of the representative documents included in the RDS and a set of rules, wherein the seed representative data includes a balanced number of representative data across a plurality of classes; updating the RDS by adding or removing representative data from the RDS based on feedback received about accuracy of classification of one or more documents by the classification system; and retraining the KB, wherein the retraining is performed based on occurrence of one or more events.
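The maintenance loop in this abstract (balanced seed data, feedback-driven updates to the RDS, event-triggered retraining of the KB) can be sketched roughly as follows. This is an illustrative assumption, not the claimed system: the word-count "knowledge base", the class name `RepresentativeDataSet`, and the `retrain_after` event (a count of corrections) are all hypothetical, and the sketch only adds representative data, whereas the abstract also covers removal.

```python
from collections import Counter, defaultdict


class RepresentativeDataSet:
    """Toy RDS whose knowledge base is a per-class word-count table."""

    def __init__(self, seed, retrain_after=2):
        # The seed must hold the same number of examples for every class.
        counts = Counter(label for _, label in seed)
        if len(set(counts.values())) > 1:
            raise ValueError("seed data must be balanced across classes")
        self.items = list(seed)
        self.retrain_after = retrain_after  # hypothetical retraining event
        self.pending = 0
        self.kb = self._train()

    def _train(self):
        """Rebuild the knowledge base from the current RDS contents."""
        kb = defaultdict(Counter)
        for text, label in self.items:
            kb[label].update(text.lower().split())
        return kb

    def classify(self, text):
        """Return the class whose word counts best match the text."""
        words = text.lower().split()
        return max(self.kb, key=lambda label: sum(self.kb[label][w] for w in words))

    def feedback(self, text, predicted, correct):
        """Add a misclassified document to the RDS; retrain once enough accumulate."""
        if predicted != correct:
            self.items.append((text, correct))
            self.pending += 1
        if self.pending >= self.retrain_after:
            self.kb = self._train()
            self.pending = 0
```

Retraining only after a batch of corrections is one possible reading of "based on occurrence of one or more events"; a timer or an accuracy drop would fit the same structure.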

    Automatic analysis of repository structure to facilitate natural language queries

    Publication No.: US10242009B2

    Publication date: 2019-03-26

    Application No.: US15139885

    Filing date: 2016-04-27

    IPC classification: G06F17/30

    Abstract: Techniques for analyzing a repository are described herein. A method for analyzing a repository may include obtaining a list of known persons in a repository based on objects, users, and groups retrieved from the repository. The method may further select one of the objects having a field and a value, and then determine whether the field of the selected object is a facet based on a probability that the field of the selected object has a limited number of possible values. In analyzing the repository, a repository information archive may be generated. The repository information archive may include the relationship between the selected object and at least one other object, statistics and counts related to properties in the selected objects, and whether or not the field of the selected object is a facet.
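The facet test in this abstract (a field is a facet when it probably has a limited number of possible values) can be approximated with a simple heuristic. The sketch below is an assumption for illustration only: it uses the ratio of distinct values to observed values as a stand-in for the probability, and the function name `detect_facets` and both parameters are hypothetical.

```python
from collections import defaultdict


def detect_facets(objects, max_distinct_ratio=0.5, min_count=3):
    """Flag fields whose values repeat often enough to act as facets.

    objects: list of dicts mapping field name -> hashable value.
    A field counts as a facet when it was observed at least min_count
    times and its distinct/total ratio is at most max_distinct_ratio.
    """
    values = defaultdict(list)
    for obj in objects:
        for field, value in obj.items():
            values[field].append(value)

    result = {}
    for field, seen in values.items():
        ratio = len(set(seen)) / len(seen)
        result[field] = len(seen) >= min_count and ratio <= max_distinct_ratio
    return result
```

With four objects whose `id` values are all different and whose `status` is either `open` or `closed`, `status` is flagged as a facet and `id` is not.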

    AUTO-MAINTAINED DOCUMENT CLASSIFICATION
    Type: Invention patent application
    Status: In force

    Publication No.: US20150012470A1

    Publication date: 2015-01-08

    Application No.: US14492914

    Filing date: 2014-09-22

    IPC classification: G06N5/04, G06N99/00

    Abstract: Machines, systems and methods for maintaining a representative data set in a document classification system, the method comprising: including an initial set of seed representative data in a representative data set (RDS) implemented for a knowledge base (KB), wherein the KB is trained to classify documents provided to a document classification system based on analysis of the representative documents included in the RDS and a set of rules, wherein the seed representative data includes a balanced number of representative data across a plurality of classes; updating the RDS by adding or removing representative data from the RDS based on feedback received about accuracy of classification of one or more documents by the classification system; and retraining the KB, wherein the retraining is performed based on occurrence of one or more events.

    Generating and executing query language statements from natural language

    Publication No.: US10169471B2

    Publication date: 2019-01-01

    Application No.: US15140839

    Filing date: 2016-04-28

    IPC classification: G06F17/30

    Abstract: Techniques for generating query language statements for a document repository are described herein. An example method includes detecting a search query corresponding to a document repository and generating a modified search query by adding atomic tags to the search query, the atomic tags being based on prior knowledge obtained by static analysis of the document repository and semantic rules. The method also includes generating enriched tags based on combinations of the atomic tags and any previously identified enriched tags and generating a first set of conditions based on combinations of the atomic tags and the generated enriched tags and generating a second set of conditions based on free-text conditions. The method also includes generating the query language statements based on the first set of conditions and the second set of conditions and displaying a plurality of documents from the document repository that satisfy the query language statements.

    GENERATING AND EXECUTING QUERY LANGUAGE STATEMENTS FROM NATURAL LANGUAGE
    Type: Invention patent application
    Status: Under examination (published)

    Publication No.: US20170024431A1

    Publication date: 2017-01-26

    Application No.: US15140839

    Filing date: 2016-04-28

    IPC classification: G06F17/30

    Abstract: Techniques for generating query language statements for a document repository are described herein. An example method includes detecting a search query corresponding to a document repository and generating a modified search query by adding atomic tags to the search query, the atomic tags being based on prior knowledge obtained by static analysis of the document repository and semantic rules. The method also includes generating enriched tags based on combinations of the atomic tags and any previously identified enriched tags and generating a first set of conditions based on combinations of the atomic tags and the generated enriched tags and generating a second set of conditions based on free-text conditions. The method also includes generating the query language statements based on the first set of conditions and the second set of conditions and displaying a plurality of documents from the document repository that satisfy the query language statements.

    AUTOMATIC ANALYSIS OF REPOSITORY STRUCTURE TO FACILITATE NATURAL LANGUAGE QUERIES
    Type: Invention patent application
    Status: Under examination (published)

    Publication No.: US20170011047A1

    Publication date: 2017-01-12

    Application No.: US14791796

    Filing date: 2015-07-06

    IPC classification: G06F17/30

    CPC classification: G06F17/30073

    Abstract: Techniques for analyzing a repository are described herein. A method for analyzing a repository may include obtaining a list of known persons in a repository based on objects, users, and groups retrieved from the repository. The method may further select one of the objects having a field and a value, and then determine whether the field of the selected object is a facet based on a probability that the field of the selected object has a limited number of possible values. In analyzing the repository, a repository information archive may be generated. The repository information archive may include the relationship between the selected object and at least one other object, statistics and counts related to properties in the selected objects, and whether or not the field of the selected object is a facet.

    RUNTIME CONTROL OF AUTOMATION ACCURACY USING ADJUSTABLE THRESHOLDS

    Publication No.: US20190102452A1

    Publication date: 2019-04-04

    Application No.: US15723400

    Filing date: 2017-10-03

    IPC classification: G06F17/30, G06N99/00

    Abstract: A computer-implemented method, system and computer program product for maintaining a target accuracy level. A target accuracy level is received. Thresholds including ongoing adjustable automation thresholds for categories are computed based on the target accuracy level. Data is received and a classification score for the categories is generated with respect to the data based on a category knowledgebase. Furthermore, a classification score is detected for a category with a higher classification score than other categories of the plurality of categories that exceeds an ongoing adjustable automation threshold. A reply to the data is automatically sent out based on the category with the higher classification score. The action, the suggestion list, and corresponding received feedback are monitored to generate a historical performance dataset. An actual accuracy level is then determined based on the historical performance dataset. The ongoing adjustable automation threshold is then adjusted based on the actual accuracy level.
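The feedback loop in this abstract (automate when the top score clears a threshold, then move the threshold according to observed accuracy) can be sketched as below. The fixed step size, the function names `decide` and `adjust_threshold`, and the boolean feedback history are assumptions for illustration; the abstract does not specify how the adjustment is computed.

```python
def decide(scores, thresholds):
    """Auto-reply if the best category clears its threshold, else suggest.

    scores: dict mapping category -> classification score.
    thresholds: dict mapping category -> current automation threshold.
    """
    best = max(scores, key=scores.get)
    if scores[best] > thresholds[best]:
        return ("auto_reply", best)
    # Below threshold: return a suggestion list ordered by score.
    return ("suggest", sorted(scores, key=scores.get, reverse=True))


def adjust_threshold(threshold, history, target_accuracy, step=0.05):
    """Move the threshold toward the target accuracy by one fixed step.

    history: list of booleans, True when feedback confirmed an automated
    action was correct (a stand-in for the historical performance dataset).
    """
    if not history:
        return threshold
    actual = sum(history) / len(history)
    if actual < target_accuracy:
        # Too many mistakes: automate less by raising the bar.
        threshold = min(1.0, threshold + step)
    elif actual > target_accuracy:
        # Accuracy to spare: automate more by lowering the bar.
        threshold = max(0.0, threshold - step)
    return round(threshold, 4)
```

A fixed step keeps the sketch short; a proportional adjustment based on the gap between actual and target accuracy would fit the same interface.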

    GENERATING AND EXECUTING QUERY LANGUAGE STATEMENTS FROM NATURAL LANGUAGE

    Publication No.: US20170024443A1

    Publication date: 2017-01-26

    Application No.: US14808138

    Filing date: 2015-07-24

    IPC classification: G06F17/30

    Abstract: Techniques for generating query language statements for a document repository are described herein. An example method includes detecting a search query corresponding to a document repository and generating a modified search query by adding atomic tags to the search query, the atomic tags being based on prior knowledge obtained by static analysis of the document repository and semantic rules. The method also includes generating enriched tags based on combinations of the atomic tags and any previously identified enriched tags and generating a first set of conditions based on combinations of the atomic tags and the generated enriched tags and generating a second set of conditions based on free-text conditions. The method also includes generating the query language statements based on the first set of conditions and the second set of conditions and displaying a plurality of documents from the document repository that satisfy the query language statements.