Abstract:
A semantic embedding model that uses a geometrical, set-centric approach to capture both ABox and TBox representational models is disclosed. The model transforms a semantically rich knowledge graph into a set of overlapping, disjoint, and/or subsumed n-dimensional spheres that capture and represent the semantics embedded in the knowledge graph.
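The three sphere relations named above (overlapping, disjoint, subsumed) can be tested from centers and radii alone. The following is a minimal sketch, not the disclosed model: the `Sphere` class, the `relation` function, and the example concepts are hypothetical, and treating tangent spheres as disjoint is an assumption.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sphere:
    """Hypothetical n-dimensional sphere standing in for a concept or set."""
    center: Sequence[float]
    radius: float


def relation(a: Sphere, b: Sphere) -> str:
    """Classify how sphere a relates to sphere b."""
    d = math.dist(a.center, b.center)  # Euclidean distance between centers
    if d >= a.radius + b.radius:
        # Assumption: spheres that merely touch count as disjoint.
        return "disjoint"
    if d + a.radius <= b.radius:
        return "a subsumed by b"
    if d + b.radius <= a.radius:
        return "b subsumed by a"
    return "overlapping"


animal = Sphere((0.0, 0.0), 5.0)
dog = Sphere((1.0, 0.0), 2.0)
plant = Sphere((10.0, 0.0), 3.0)
pet = Sphere((4.0, 0.0), 2.0)

print(relation(dog, animal))    # dog lies entirely inside animal
print(relation(animal, plant))  # centers are farther apart than the radii sum
print(relation(pet, animal))    # pet partly sticks out of animal
```

Under this reading, a TBox subsumption such as "every dog is an animal" corresponds to one sphere lying inside another, and an ABox membership could be modeled as a point falling inside a sphere.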
Abstract:
A system for identifying a schema for storing graph data includes a database containing a graph dataset of data and relationships between data pairs, and a list of storage methods that are each a distinct structural arrangement of the data and relationships from the graph dataset. An analyzer module collects statistics for the graph dataset. A data classification module uses the collected statistics to calculate metrics describing the data and relationships in the graph dataset, uses the calculated metrics to group the data and relationships into a plurality of graph dataset subsets, and associates each graph dataset subset with one of the plurality of storage methods. The resulting group of storage methods associated with the plurality of graph dataset subsets includes a unique storage method for each graph dataset subset. The data and relationships in each graph dataset subset are arranged in accordance with the associated storage method.
Abstract:
A system stores graph data as a multi-dimensional cluster. The system has a database with a graph dataset containing data and relationships between data pairs, and a schema list of storage methods that use a table with columns and rows associated with data or relationships. An analyzer module collects statistics of a graph dataset, and a dimension identification module identifies a plurality of dimensions that each represent a column in the table. A schema creation and loading module creates a modified storage method having a plurality of distinct table blocks and a plurality of table block indexes, one index for each table block, and arranges the data and relationships in the given graph dataset in accordance with the modified storage method to create the multi-dimensional cluster.
Abstract:
Generate, from a logical formula, a directed acyclic graph (DAG) having a plurality of nodes and a plurality of edges. Assign an initial embedding to each node and edge, to one of a plurality of layers. Compute a plurality of initial node states by using feed-forward networks, and construct cross-dependent embeddings between conjecture node embeddings and premise node embeddings. Topologically sort the DAG with the initial embeddings and node states. Beginning from a lowest rank, compute layer-by-layer embedding updates for each of the plurality of layers until a root is reached. Assign the embedding update for the root node as a final embedding for the DAG. Provide the final embedding for the DAG as input to a machine learning system, and carry out automatic theorem proving with that system.
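The bottom-up pass described above can be sketched with the standard-library topological sorter. This is an illustration under stated assumptions, not the claimed method: the function name `embed_dag` is hypothetical, and a plain average of child states added to the node's initial embedding stands in for the feed-forward networks and cross-dependent embeddings, which are not modeled here.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def embed_dag(children, init, root):
    """Propagate embeddings from the leaves up to the root.

    children: maps each node to the list of its child nodes.
    init:     maps each node to its initial embedding (list of floats).
    root:     the node whose final state is returned as the DAG embedding.
    """
    # Passing children as predecessors makes every child come before its parent.
    order = TopologicalSorter(children).static_order()
    state = {}
    for node in order:
        kids = children.get(node, ())
        if not kids:
            state[node] = list(init[node])
            continue
        dim = len(init[node])
        # Assumption: the update is the mean of the child states plus the
        # node's own initial embedding (a stand-in for a learned update).
        mean = [sum(state[k][i] for k in kids) / len(kids) for i in range(dim)]
        state[node] = [x + m for x, m in zip(init[node], mean)]
    return state[root]


# DAG for the formula (p AND q) -> p; the leaf p is shared by two parents.
children = {"imp": ["and", "p"], "and": ["p", "q"], "p": [], "q": []}
init = {"imp": [0.0, 0.0], "and": [0.0, 0.0], "p": [1.0, 0.0], "q": [0.0, 1.0]}

final = embed_dag(children, init, "imp")
print(final)  # [0.75, 0.25]
```

Because every node is visited only after all of its children, the root state depends on the whole formula, which is why it can serve as a single embedding for the DAG.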
Abstract:
Embodiments of the present invention are directed to a computer-implemented method for generating a framework for analyzing adverse drug reactions. A non-limiting example of the computer-implemented method includes receiving, to a processor, a plurality of drug chemical structures. The non-limiting example also includes receiving, to the processor, a plurality of known drug-adverse drug reaction associations. The non-limiting example also includes constructing, by the processor, a deep learning framework for each of a plurality of adverse drug reactions based at least in part upon the plurality of drug chemical structures and the plurality of known drug-adverse drug reaction associations.
Abstract:
Various embodiments of the invention relate to optimizing storage of schema-less data. At least one of a schema-less dataset including a plurality of resources and one or more query workloads associated with the plurality of resources is received. Each resource is associated with at least a plurality of properties. At least one set of co-occurring properties from the plurality of properties is identified. A graph including a plurality of nodes is generated. Each of the nodes represents a unique property in the set of co-occurring properties. The graph further includes an edge connecting each pair of nodes that represents a pair of co-occurring properties. A schema is generated based on the graph that assigns a column identifier from a table to each unique property represented by one of the nodes in the graph.
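One way to picture the property graph and the column assignment is shown below. This is a sketch under assumptions: the abstract does not say how column identifiers are derived from the graph, so greedy graph coloring is swapped in here as one plausible choice, and the function names and sample resources are hypothetical.

```python
from itertools import combinations


def cooccurrence_graph(resources):
    """Build an undirected graph with one node per property and one edge
    per pair of properties that appear together on some resource."""
    graph = {}
    for props in resources.values():
        for p in props:
            graph.setdefault(p, set())
        for a, b in combinations(sorted(props), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def assign_columns(graph):
    """Greedy coloring (an assumption, not stated in the abstract):
    co-occurring properties never receive the same column identifier."""
    columns = {}
    # Visit high-degree properties first; break ties by name for determinism.
    for prop in sorted(graph, key=lambda p: (-len(graph[p]), p)):
        used = {columns[n] for n in graph[prop] if n in columns}
        col = 0
        while col in used:
            col += 1
        columns[prop] = col
    return columns


resources = {
    "alice": {"name", "age"},
    "acme": {"name", "industry"},
    "paris": {"population"},
}
graph = cooccurrence_graph(resources)
columns = assign_columns(graph)
print(columns)
```

In this toy dataset, `age` and `industry` never occur on the same resource, so they can share a column, and four properties fit into two columns.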
Abstract:
A schema for a dataset is identified by identifying a dataset comprising data and relationships between data pairs. An original schema is identified for the dataset. This original schema comprises an organizational structure. An initial fit between the dataset and the original schema is determined. The initial fit quantifies the conformity of the data in the dataset to the organizational structure of the original schema. A plurality of additional schemas is identified. Each additional schema is a distinct organizational schema. The dataset is partitioned into a plurality of subsets. Each subset comprises a modified fit quantifying a modified conformity of subset data in each subset to one of the original schema and the additional schemas. The modified fit is greater than the initial fit.
Abstract:
A method of improving computing efficiency of a computing device for language-independent problem solving and reasoning includes receiving a query from a user, which is decomposed into one or more sub-queries arranged according to a tree structure. The one or more sub-queries are executed in a knowledge base. The results of the executed one or more sub-queries are received and composed into a query response. The query response is transmitted to the user.
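The decompose, execute, and compose flow can be illustrated with a small tree evaluator. This is only a sketch: the `SubQuery` class, the `lookup`/`and`/`or` operators, and the dictionary-based knowledge base are hypothetical stand-ins, since the abstract does not specify the sub-query language or the knowledge base interface.

```python
from dataclasses import dataclass, field


@dataclass
class SubQuery:
    """Hypothetical node in the sub-query tree."""
    op: str                                        # "lookup", "and", or "or"
    args: tuple = ()                               # (subject, predicate) for a lookup
    children: list = field(default_factory=list)   # nested sub-queries


def execute(query, kb):
    """Run leaf look-ups against the knowledge base, then compose the
    partial results bottom-up into a single response."""
    if query.op == "lookup":
        return set(kb.get(query.args, ()))
    results = [execute(child, kb) for child in query.children]
    if query.op == "and":
        return set.intersection(*results)
    if query.op == "or":
        return set.union(*results)
    raise ValueError(f"unknown operator: {query.op}")


# Toy knowledge base keyed by (subject, predicate); the facts are illustrative.
kb = {
    ("France", "borders"): {"Spain", "Germany", "Italy"},
    ("Austria", "borders"): {"Germany", "Italy", "Hungary"},
}

# "Which countries border both France and Austria?" as a two-leaf tree.
query = SubQuery("and", children=[
    SubQuery("lookup", ("France", "borders")),
    SubQuery("lookup", ("Austria", "borders")),
])
response = execute(query, kb)
print(sorted(response))  # ['Germany', 'Italy']
```

Because the tree refers only to structured look-ups and set operations, the same evaluation applies regardless of the natural language in which the user phrased the original query.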
Abstract:
One embodiment of the invention provides a method for natural language processing (NLP). The method comprises extracting knowledge outside of the text content of an NLP instance by extracting a set of subgraphs from a knowledge graph associated with the text content. The set of subgraphs comprises the knowledge. The method further comprises encoding the knowledge with the text content into a fixed-size graph representation by filtering and encoding the set of subgraphs. The method further comprises applying a text embedding algorithm to the text content to generate a fixed-size text representation, and classifying the text content based on the fixed-size graph representation and the fixed-size text representation.