Abstract:
Techniques for generating query language statements for a document repository are described herein. An example method includes detecting a search query corresponding to a document repository and generating a modified search query by adding atomic tags to the search query, the atomic tags being based on prior knowledge obtained by static analysis of the document repository and semantic rules. The method also includes generating enriched tags based on combinations of the atomic tags and any previously identified enriched tags, generating a first set of conditions based on combinations of the atomic tags and the generated enriched tags, and generating a second set of conditions based on free-text conditions. The method further includes generating the query language statements based on the first set of conditions and the second set of conditions, and displaying a plurality of documents from the document repository that satisfy the query language statements.
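The pipeline in this abstract (atomic tags, enriched tags, two sets of conditions, one statement) might be sketched roughly as follows. This is only an illustration: the tag vocabulary, the single enrichment rule, and the SQL-like output syntax are all hypothetical, since the abstract does not specify them.

```python
# Hypothetical prior knowledge from static analysis: query term -> atomic tag.
ATOMIC_TAGS = {
    "invoice": ("doc_type", "invoice"),
    "alice": ("person", "alice"),
    "2014": ("year", "2014"),
}


def enrich(atomic):
    """Hypothetical semantic rule: a person plus a doc_type implies an author tag."""
    kinds = {kind: value for kind, value in atomic}
    enriched = []
    if "person" in kinds and "doc_type" in kinds:
        enriched.append(("author", kinds["person"]))
    return enriched


def build_statement(query):
    """Turn a plain search query into an SQL-like statement (assumed syntax)."""
    words = query.lower().split()
    atomic = [ATOMIC_TAGS[w] for w in words if w in ATOMIC_TAGS]
    free_text = [w for w in words if w not in ATOMIC_TAGS]
    enriched = enrich(atomic)

    # First set of conditions: built from atomic and enriched tags.
    first = [f"{kind} = '{value}'" for kind, value in atomic + enriched]
    # Second set of conditions: free-text terms that matched no tag.
    second = [f"CONTAINS('{w}')" for w in free_text]

    conditions = first + second
    if not conditions:
        return "SELECT * FROM documents"
    return "SELECT * FROM documents WHERE " + " AND ".join(conditions)


print(build_statement("alice invoice overdue"))
```

In this sketch, words that match no known tag fall through to free-text conditions, which mirrors the split between the first and second sets of conditions.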
Abstract:
Machines, systems and methods for maintaining a representative data set in a document classification system are described. An example method comprises: including an initial set of seed representative data in a representative data set (RDS) implemented for a knowledge base (KB), wherein the KB is trained to classify documents provided to a document classification system based on analysis of the representative documents included in the RDS and a set of rules, and wherein the seed representative data includes a balanced number of representative data across a plurality of classes; updating the RDS by adding or removing representative data based on feedback received about the accuracy of classification of one or more documents by the classification system; and retraining the KB, wherein the retraining is performed upon the occurrence of one or more events.
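A minimal sketch of the RDS update loop is given below, under stated assumptions. The class name, the feedback format, and the choice of "number of accumulated changes" as the retraining event are all hypothetical; the abstract leaves the events and the update policy open.

```python
from collections import defaultdict


class RepresentativeDataSet:
    """Toy RDS: class label -> list of representative documents."""

    def __init__(self, seed):
        # seed: mapping of class label -> documents, assumed to be balanced.
        self.docs = defaultdict(list)
        for label, items in seed.items():
            self.docs[label].extend(items)
        self.pending_changes = 0

    def apply_feedback(self, doc, predicted, correct):
        """Update the RDS from one piece of classification-accuracy feedback."""
        if predicted == correct:
            return
        # Misclassified: add the document under its correct class.
        if doc not in self.docs[correct]:
            self.docs[correct].append(doc)
            self.pending_changes += 1
        # Remove it from the wrong class if it was stored there.
        if doc in self.docs[predicted]:
            self.docs[predicted].remove(doc)
            self.pending_changes += 1

    def should_retrain(self, threshold=2):
        # Hypothetical triggering event: enough accumulated RDS changes.
        return self.pending_changes >= threshold


rds = RepresentativeDataSet({"spam": ["a", "b"], "ham": ["c", "d"]})
rds.apply_feedback("b", predicted="spam", correct="ham")
print(dict(rds.docs), rds.should_retrain())
```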
Abstract:
Techniques for analyzing a repository are described herein. A method for analyzing a repository may include obtaining a list of known persons in a repository based on objects, users, and groups retrieved from the repository. The method may further include selecting one of the objects having a field and a value, and then determining whether the field of the selected object is a facet based on a probability that the field of the selected object has a limited number of possible values. In analyzing the repository, a repository information archive may be generated. The repository information archive may include the relationship between the selected object and at least one other object, statistics and counts related to properties in the selected objects, and whether or not the field of the selected object is a facet.
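The facet test could be approximated with a simple heuristic, shown here as a sketch rather than the method the abstract actually uses. The ratio of distinct values to observed values stands in for "the probability that the field has a limited number of possible values"; the function name and both thresholds are assumptions.

```python
def is_facet(values, max_distinct_ratio=0.1, min_samples=20):
    """Guess whether a field is a facet from its observed values.

    Assumed heuristic: a field whose distinct values are few relative to the
    number of observations probably has a limited set of possible values.
    """
    if len(values) < min_samples:
        # Too little evidence to decide; treat the field as not a facet.
        return False
    distinct_ratio = len(set(values)) / len(values)
    return distinct_ratio <= max_distinct_ratio


statuses = ["open", "closed"] * 15       # 30 observations, 2 distinct values
doc_ids = [str(i) for i in range(30)]    # 30 observations, all distinct

print(is_facet(statuses), is_facet(doc_ids))
```

A field such as a status flag passes the test, while an identifier field with a unique value per object does not.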
Abstract:
Techniques for processing a speech-to-text query are described herein. The techniques may include receiving a plurality of speech-to-text translation alternatives for a phrase of a natural language query, and tagging and parsing each of the translation alternatives based on a static analysis of a known domain that is at least partially structured, known tags of the known domain, and custom rules. The techniques may also include ranking the translation alternatives based on the tagging and parsing, and translating the phrase based on the ranking.
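As a rough sketch of the ranking step, the example below scores each alternative by the fraction of its words that match known domain tags. It covers tagging only, not parsing or custom rules, and the tag set and scoring formula are hypothetical.

```python
# Hypothetical known tags of the domain, e.g. from static analysis.
KNOWN_TAGS = {"show", "documents", "by", "author", "alice"}


def score(alternative):
    """Fraction of words in the alternative that match a known domain tag."""
    words = alternative.lower().split()
    if not words:
        return 0.0
    return sum(w in KNOWN_TAGS for w in words) / len(words)


def best_translation(alternatives):
    # max() keeps the first alternative on ties, so the recognizer's own
    # ordering acts as the tie-breaker.
    return max(alternatives, key=score)


alternatives = [
    "show documents buy author alice",
    "show documents by author alice",
]
print(best_translation(alternatives))
```

Here the second alternative wins because every word in it matches a known tag, whereas "buy" in the first does not.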
Abstract:
A computer-implemented method, system and computer program product for maintaining a target accuracy level. A target accuracy level is received. Thresholds, including ongoing adjustable automation thresholds for a plurality of categories, are computed based on the target accuracy level. Data is received, and a classification score for each of the categories is generated with respect to the data based on a category knowledge base. Furthermore, a category is detected whose classification score is higher than that of the other categories and exceeds an ongoing adjustable automation threshold. A reply to the data is automatically sent out based on the category with the higher classification score. The action, the suggestion list, and corresponding received feedback are monitored to generate a historical performance dataset. An actual accuracy level is then determined based on the historical performance dataset. The ongoing adjustable automation threshold is then adjusted based on the actual accuracy level.
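The feedback loop on the automation threshold might look like the sketch below. The fixed step size, the boolean feedback history, and both function names are assumptions made for illustration; the abstract does not say how the adjustment is computed.

```python
def pick_category(scores, thresholds):
    """Return the top-scoring category if it clears its threshold, else None."""
    best = max(scores, key=scores.get)
    return best if scores[best] > thresholds[best] else None


def adjust_threshold(threshold, history, target_accuracy, step=0.05):
    """Nudge the automation threshold toward the target accuracy.

    history: list of booleans, True where an automated reply was confirmed
    correct by feedback (an assumed format for the performance dataset).
    """
    if not history:
        return threshold
    actual = sum(history) / len(history)
    if actual < target_accuracy:
        # Too many wrong automated replies: automate less often.
        return min(1.0, threshold + step)
    if actual > target_accuracy:
        # Accuracy to spare: automate more often.
        return max(0.0, threshold - step)
    return threshold


scores = {"billing": 0.92, "support": 0.40}
thresholds = {"billing": 0.90, "support": 0.90}
print(pick_category(scores, thresholds))
print(adjust_threshold(0.90, [True, False, True, False], target_accuracy=0.9))
```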