Abstract:
According to an aspect, storing and querying conceptual indices (CIs) includes creating a conceptual inverted index (CII) from the CIs. The CII includes CII entries, each of which corresponds to a concept in a concept graph. Creating the CII includes populating each entry with pointers to documents, selected from the CIs, whose likelihoods of being related to the concept are greater than a threshold value, together with the corresponding likelihoods. An aspect also includes receiving a query that includes a concept in the concept graph, and generating query results from a search, the query results including at least a subset of the pointers to documents. Each of the CIs is associated with a corresponding document and includes a CI entry for each concept in the concept graph, and each of the CI entries specifies a value indicating a likelihood that the document is related to the concept in the concept graph.
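The inversion described above can be illustrated with a minimal sketch. It assumes each CI is a plain dictionary mapping concept names to likelihoods, and that a document identifier stands in for a pointer. The names `build_cii` and `query_cii`, the sample data, and the choice to sort each entry by descending likelihood are illustrative assumptions, not details taken from the abstract.

```python
from collections import defaultdict

def build_cii(cis, threshold):
    """Invert per-document CIs into a concept -> [(doc_id, likelihood)] map.

    cis: dict of doc_id -> {concept: likelihood} (hypothetical layout).
    Only likelihoods strictly greater than `threshold` are kept.
    """
    cii = defaultdict(list)
    for doc_id, ci in cis.items():
        for concept, likelihood in ci.items():
            if likelihood > threshold:
                cii[concept].append((doc_id, likelihood))
    # Assumption: keep each entry sorted by descending likelihood.
    for postings in cii.values():
        postings.sort(key=lambda p: (-p[1], p[0]))
    return dict(cii)

def query_cii(cii, concept, limit=None):
    """Return all (or the first `limit`) document pointers for a concept."""
    postings = cii.get(concept, [])
    return postings if limit is None else postings[:limit]

# Hypothetical CIs for three documents over a three-concept graph.
cis = {
    "doc1": {"graph": 0.9, "index": 0.2, "query": 0.6},
    "doc2": {"graph": 0.4, "index": 0.8, "query": 0.7},
    "doc3": {"graph": 0.7, "index": 0.1, "query": 0.3},
}
cii = build_cii(cis, threshold=0.5)
print(query_cii(cii, "graph"))
```

Storing the likelihood next to each pointer lets a query return a ranked subset without going back to the per-document CIs.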
Abstract:
According to an aspect, summarizing relevance of a document to a conceptual query includes receiving the conceptual query, accessing concepts extracted from the document, and computing a degree to which the conceptual query is related to each of the extracted concepts. The computing is responsive to a metric that measures a relevance between the concepts in the conceptual query and the extracted concepts. The method also includes creating a summary by selecting a threshold number of the concepts having a greatest degree of relation to the conceptual query, and outputting the summary including the selected threshold number of concepts.
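As a rough sketch of the selection step, the code below scores each extracted concept against the query and keeps the top `k`. The relatedness metric is a hypothetical lookup table, and taking the maximum over the query concepts is an assumed way of combining scores; the abstract does not specify either.

```python
# Hypothetical symmetric relatedness table; a real metric would be
# derived from a concept graph or a similar resource.
RELATED = {
    frozenset(("diabetes", "insulin")): 0.9,
    frozenset(("diabetes", "glucose")): 0.8,
    frozenset(("diabetes", "diet")): 0.5,
}

def relatedness(a, b):
    """Return an assumed relatedness score in [0, 1] for two concepts."""
    if a == b:
        return 1.0
    return RELATED.get(frozenset((a, b)), 0.0)

def summarize(query_concepts, doc_concepts, metric, k):
    """Pick the k document concepts most related to the query."""
    scored = []
    for concept in doc_concepts:
        # Assumption: a concept's degree is its best match in the query.
        degree = max((metric(q, concept) for q in query_concepts), default=0.0)
        scored.append((concept, degree))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]

summary = summarize(
    ["diabetes"], ["insulin", "glucose", "football", "diet"], relatedness, k=2
)
print(summary)
```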
Abstract:
According to an aspect, conceptual analysis of a document includes accessing a concept graph that includes a plurality of nodes and edges. Each node represents a concept and each edge represents a known relation between two concepts. Conceptual analysis of the document further includes computing a relevance of the document to concepts in the concept graph. The computing includes receiving a priori information about the document including concepts extracted from the document. The concepts extracted from the document include a subset of the concepts in the concept graph. The computing also includes combining the a priori information and the concept graph to generate a posteriori information that indicates a likelihood that the document is related to each of the concepts in the concept graph.
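One way to picture combining a priori information with the graph is a random walk with restart, sketched below. This technique is an assumption chosen for illustration; the abstract does not name the method used. The graph, the prior, and the `restart` parameter are hypothetical, and the resulting scores form a relevance distribution rather than calibrated probabilities.

```python
def propagate(graph, prior, restart=0.3, iterations=50):
    """Spread prior concept weights over the graph (random walk with restart).

    graph: dict of node -> list of neighbouring nodes.
    prior: dict of node -> weight for concepts extracted from the document.
    """
    total = sum(prior.values())
    if total <= 0:
        raise ValueError("prior must contain positive weight")
    nodes = list(graph)
    p0 = {n: prior.get(n, 0.0) / total for n in nodes}
    scores = dict(p0)
    for _ in range(iterations):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # A node with no edges returns its mass to the prior.
                for m in nodes:
                    nxt[m] += (1 - restart) * scores[n] * p0[m]
                continue
            share = (1 - restart) * scores[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        scores = nxt
    return scores

# Hypothetical concept graph; only "heart" was extracted from the document.
graph = {
    "cardiology": ["heart"],
    "heart": ["cardiology", "blood"],
    "blood": ["heart"],
    "tax": [],
}
posterior = propagate(graph, {"heart": 1.0})
print(posterior)
```

Concepts adjacent to the extracted one receive a nonzero score even though they never appeared in the document, while the unconnected concept stays at zero.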
Abstract:
Mechanisms, in a system comprising a host system and at least one accelerator device, for performing a concept analysis operation are provided. The host system extracts a set of one or more concepts from an information source and provides the set of one or more concepts to the accelerator device. The host system also provides at least one matrix representation data structure representing a graph of concepts and relationships between concepts in a corpus. The accelerator device executes the concept analysis operation internal to the accelerator device to generate an output vector identifying concepts in the corpus, identified in the at least one matrix representation data structure, related to the set of one or more concepts extracted from the information source. The accelerator device outputs the output vector to the host system which utilizes the output vector to respond to a request submitted to the host system associated with the information source.
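The division of work between host and accelerator can be sketched in software. The example assumes the matrix representation is compressed sparse row (CSR) and that the concept analysis operation is a single sparse matrix-vector product; both are assumptions, since the abstract names neither. `accelerator_spmv` is an ordinary Python function standing in for the device.

```python
def to_csr(n, edges):
    """Host side: build CSR arrays from (row, col, weight) triples."""
    rows = [[] for _ in range(n)]
    for r, c, w in edges:
        rows[r].append((c, w))
    indptr, indices, data = [0], [], []
    for row in rows:
        for c, w in sorted(row):
            indices.append(c)
            data.append(w)
        indptr.append(len(indices))
    return indptr, indices, data

def accelerator_spmv(csr, x):
    """Stand-in for the device: compute y = A @ x from the CSR arrays."""
    indptr, indices, data = csr
    y = []
    for i in range(len(indptr) - 1):
        start, end = indptr[i], indptr[i + 1]
        y.append(sum(data[k] * x[indices[k]] for k in range(start, end)))
    return y

# Hypothetical corpus graph over four concepts (symmetric edges).
concepts = ["heart", "blood", "cardiology", "tax"]
edges = [(0, 1, 1.0), (1, 0, 1.0), (0, 2, 1.0), (2, 0, 1.0)]
csr = to_csr(len(concepts), edges)

# Host extracts concepts from the information source and encodes them.
extracted = {"heart"}
x = [1.0 if c in extracted else 0.0 for c in concepts]

# "Device" returns the output vector; host maps it back to concept names.
y = accelerator_spmv(csr, x)
related = [concepts[i] for i, v in enumerate(y) if v > 0]
print(related)
```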
Abstract:
According to an aspect, searching, recommending, and exploring documents through conceptual associations includes a method that receives a plurality of documents and extracts concepts from each of the documents. A degree of relation between each of the documents and concepts in a knowledge base is calculated. The method also includes, in response to receiving a query, determining one or more concepts from the query. For each of the concepts, a list of the documents having the highest degree of relation to the concept is retrieved. The method also includes outputting a list that is responsive to the one or more retrieved lists.
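The query path can be sketched as follows, assuming the document-to-concept degrees have already been calculated. Matching query words against known concept names and merging the per-concept lists by summing degrees are both illustrative assumptions; the abstract only says the output list is responsive to the retrieved lists.

```python
def concepts_in_query(query, known_concepts):
    """Assumed concept detection: exact word match against known names."""
    words = set(query.lower().split())
    return [c for c in known_concepts if c in words]

def top_documents(relation, concept, n):
    """Return the n documents with the highest degree for one concept."""
    ranked = sorted(relation[concept].items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]

def search(query, relation, n=2):
    """Retrieve a top-n list per query concept and merge the lists."""
    merged = {}
    for concept in concepts_in_query(query, relation):
        for doc, degree in top_documents(relation, concept, n):
            # Assumption: merge by summing degrees across concepts.
            merged[doc] = merged.get(doc, 0.0) + degree
    return sorted(merged, key=lambda d: (-merged[d], d))

# Hypothetical precomputed degrees: concept -> {document: degree}.
relation = {
    "insulin": {"d1": 0.9, "d2": 0.4, "d3": 0.1},
    "diet": {"d1": 0.2, "d2": 0.8, "d3": 0.5},
}
results = search("insulin and diet", relation)
print(results)
```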
Abstract:
According to an aspect, summarizing relevance of a document to a conceptual query includes receiving the conceptual query, accessing concepts extracted from the document, and computing a degree to which the conceptual query is related to each of the extracted concepts. The computing is responsive to a metric that measures a relevance between the concepts in the conceptual query and the extracted concepts. An aspect also includes creating a summary by selecting a threshold number of the concepts having a greatest degree of relation to the conceptual query, and outputting the summary including the selected threshold number of concepts.
Abstract:
According to an aspect, automatically linking text to concepts in a knowledge base using differential analysis includes receiving a text string and selecting, based on contents of the text string, a plurality of data sources that correspond to concepts in the knowledge base. In a further aspect, automatically linking the text to the concepts includes calculating, for each of the selected data sources, a probability that the text string is output by a language model built using the selected data source, calculating a probability that the text string is output by a generic language model, calculating link confidence scores for each concept based on a differential analysis of the probabilities, and creating a link from the text string to one of the concepts in the knowledge base. The creating is based on a link confidence score of the concept being more than a threshold value away from a prescribed threshold.
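The differential analysis can be illustrated with a small sketch. It assumes add-one-smoothed unigram language models, a confidence score equal to the log-probability under the concept's model minus the log-probability under the generic model, and a link rule that accepts the best concept only when its score exceeds a threshold. The abstract fixes none of these choices, and the candidate data sources are taken as already selected. All names and sample texts are hypothetical.

```python
import math
from collections import Counter

def log_prob(tokens, corpus_text, vocab_size):
    """Log-probability of tokens under an add-one-smoothed unigram model."""
    counts = Counter(corpus_text.lower().split())
    total = sum(counts.values())
    return sum(
        math.log((counts[w] + 1) / (total + vocab_size)) for w in tokens
    )

def link_text(text, sources, generic_text, threshold=0.5):
    """Return (linked concept or None, per-concept confidence scores)."""
    tokens = text.lower().split()
    vocab = set(tokens) | set(generic_text.lower().split())
    for source_text in sources.values():
        vocab |= set(source_text.lower().split())
    generic_lp = log_prob(tokens, generic_text, len(vocab))
    # Differential score: concept model versus generic model.
    scores = {
        concept: log_prob(tokens, source_text, len(vocab)) - generic_lp
        for concept, source_text in sources.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > threshold else None), scores

# Hypothetical data sources, one per candidate concept.
sources = {
    "jaguar_cat": "the jaguar is a large cat that hunts in the rainforest",
    "jaguar_car": "the jaguar is a british car with a powerful engine",
}
generic_text = " ".join(sources.values()) + (
    " the weather is nice and the market is open today"
)

link, scores = link_text("jaguar hunts in the rainforest", sources, generic_text)
print(link, scores)
```

Comparing against the generic model keeps common words from producing a link by themselves: text that is no more likely under any concept's model than under the generic one yields no link.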
Abstract:
Mechanisms are provided for performing a matrix operation. A processor of a data processing system is configured to perform cluster-based matrix reordering of an input matrix. An input matrix, which comprises nodes associated with elements of the matrix, is received. The nodes are clustered into clusters based on numbers of connections with other nodes within and between the clusters, and the clusters are ordered by minimizing a total length of cross cluster connections between nodes of the clusters, to thereby generate a reordered matrix. A lookup table is generated identifying new locations of nodes of the input matrix, in the reordered matrix. A matrix operation is then performed based on the reordered matrix and the lookup table.
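The reordering pipeline can be sketched on a toy matrix. The sketch assumes the clusters are already known, measures the length of a cross-cluster connection as the distance between cluster positions, and finds the best cluster order by brute force over permutations, which is feasible only for a handful of clusters. These choices and all names are illustrative; the abstract does not describe the clustering or ordering algorithms in this detail.

```python
from itertools import permutations

def order_clusters(clusters, edges):
    """Choose the cluster order that minimizes total cross-cluster length."""
    cluster_of = {n: ci for ci, members in enumerate(clusters) for n in members}

    def cost(order):
        pos = {ci: p for p, ci in enumerate(order)}
        return sum(
            abs(pos[cluster_of[a]] - pos[cluster_of[b]])
            for a, b in edges
            if cluster_of[a] != cluster_of[b]
        )

    return min(permutations(range(len(clusters))), key=cost)

def build_lookup(clusters, order):
    """Lookup table: original node index -> index in the reordered matrix."""
    lookup = {}
    for ci in order:
        for node in clusters[ci]:
            lookup[node] = len(lookup)
    return lookup

def reorder(matrix, lookup):
    """Apply the same permutation to the rows and columns of the matrix."""
    n = len(matrix)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[lookup[i]][lookup[j]] = matrix[i][j]
    return out

def matvec(matrix, x):
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

def bandwidth(matrix):
    """Largest |i - j| over the nonzero entries."""
    n = len(matrix)
    return max(abs(i - j) for i in range(n) for j in range(n) if matrix[i][j])

# Hypothetical six-node graph with three assumed clusters.
clusters = [[0, 3], [1, 4], [2, 5]]
edges = [(0, 3), (1, 4), (2, 5), (0, 2), (3, 5), (5, 1)]
A = [[0] * 6 for _ in range(6)]
for a, b in edges:
    A[a][b] = A[b][a] = 1

order = order_clusters(clusters, edges)
lookup = build_lookup(clusters, order)
A_reordered = reorder(A, lookup)

# Matrix operation on the reordered matrix, mapped back via the lookup table.
x = [1, 2, 3, 4, 5, 6]
x_reordered = [0] * 6
for node, new in lookup.items():
    x_reordered[new] = x[node]
y_reordered = matvec(A_reordered, x_reordered)
y = [y_reordered[lookup[i]] for i in range(6)]
print(order, lookup, y)
```

In this toy case the reordering pulls the nonzero entries closer to the diagonal, and the lookup table lets the result be read back in the original node order.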