Efficient, in-memory, relational representation for heterogeneous graphs

    Publication number: US11120082B2

    Publication date: 2021-09-14

    Application number: US15956115

    Filing date: 2018-04-18

    Abstract: Techniques are provided herein for efficient representation of heterogeneous graphs in memory. In an embodiment, vertices and edges of the graph are segregated by type. Each property of a type of vertex or edge has values stored in a respective vector. Directed or undirected edges of a same type are stored in compressed sparse row (CSR) format. The CSR format is more or less repeated for edge traversal in either forward or reverse direction. An edge map translates edge offsets obtained from traversal in the reverse direction for use with data structures that expect edge offsets in the forward direction. Subsequent filtration and/or traversal by type or property of vertex or edge entails minimal data access and maximal data locality, thereby increasing efficient use of the graph.
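
The forward CSR, reverse CSR, and edge map described in this abstract can be pictured with a minimal Python sketch. It covers a single edge type only. The function names, the list-based layout, and the `weight` property vector are illustrative assumptions, not details taken from the patent.

```python
def _prefix_counts(num_vertices, keys):
    """Build a CSR 'begin' array: begin[v]..begin[v+1] spans the entries keyed by v."""
    begin = [0] * (num_vertices + 1)
    for k in keys:
        begin[k + 1] += 1
    for v in range(num_vertices):
        begin[v + 1] += begin[v]
    return begin


def build_forward_csr(num_vertices, edges):
    """Group (src, dst) pairs by source. An edge's position is its forward offset."""
    order = sorted(range(len(edges)), key=lambda i: edges[i][0])  # stable sort
    src = [edges[i][0] for i in order]
    dst = [edges[i][1] for i in order]
    return _prefix_counts(num_vertices, src), src, dst


def build_reverse_csr(num_vertices, src, dst):
    """Group the same edges by destination and record the edge map.

    edge_map[r] is the forward offset of the edge stored at reverse offset r.
    """
    edge_map = sorted(range(len(dst)), key=lambda e: dst[e])
    rsrc = [src[e] for e in edge_map]
    return _prefix_counts(num_vertices, dst), rsrc, edge_map


def in_edges(v, rbegin, rsrc, edge_map, weight):
    """List (source, weight) for edges entering v.

    The hypothetical 'weight' vector is indexed by forward edge offset, so
    each reverse offset is translated through edge_map before the lookup.
    """
    return [(rsrc[r], weight[edge_map[r]]) for r in range(rbegin[v], rbegin[v + 1])]


# Tiny example: 3 vertices, 4 edges, one property value per edge.
begin, src, dst = build_forward_csr(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
rbegin, rsrc, edge_map = build_reverse_csr(3, src, dst)
weight = [0.5, 1.5, 2.5, 3.5]
incoming = in_edges(2, rbegin, rsrc, edge_map, weight)
```

Storing the property vector once, in forward order, keeps the reverse index small: it needs only the source array and the edge map, at the cost of one extra indirection per reverse traversal step.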

    Deterministic semantic for graph property update queries and its efficient implementation

    Publication number: US11928097B2

    Publication date: 2024-03-12

    Application number: US17479006

    Filing date: 2021-09-20

    CPC classification number: G06F16/2315 G06F11/0772 G06F16/2365

    Abstract: Efficiently implemented herein is a deterministic semantic for property updates by graph queries. Mechanisms of determinism herein ensure data consistency for graph mutation. These mechanisms facilitate optimistic execution of graph access despite a potential data access conflict. This approach may include various combinations of special activities such as detecting potential conflicts during query compile time, applying query transformations to eliminate those conflicts during code generation where possible, and executing updates in an optimistic way that safely fails if determinism cannot be guaranteed. In an embodiment, a computer receives a request to modify a graph. The request to modify the graph is optimistically executed after preparation and according to safety precautions as presented herein. Based on optimistically executing the request, a data access conflict actually occurs and is automatically detected. Based on the data access conflict, optimistically executing the request is prematurely and automatically halted without finishing executing the request.
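
One simple way to picture "optimistic execution that safely fails" is sketched below in Python. It assumes, purely for illustration, that a conflict means two assignments writing different values to the same vertex property within one update. The `UpdateConflict` exception and the `apply_updates` function are hypothetical names, and staging writes before committing is a design choice of this sketch rather than the mechanism claimed in the patent.

```python
class UpdateConflict(Exception):
    """Raised when the outcome of an update would depend on execution order."""


def apply_updates(props, assignments):
    """Optimistically apply (vertex, value) assignments to one property.

    Writes are staged first. If two assignments give the same vertex
    different values, execution halts immediately and 'props' is left
    untouched, so a failed update never leaves a partial result behind.
    """
    pending = {}
    for vertex, value in assignments:
        if vertex in pending and pending[vertex] != value:
            raise UpdateConflict(
                f"vertex {vertex}: conflicting values {pending[vertex]!r} and {value!r}"
            )
        pending[vertex] = value
    props.update(pending)  # commit only after every assignment was conflict-free


# A conflict-free update commits; a conflicting one aborts without side effects.
ages = {1: 10, 2: 20}
apply_updates(ages, [(1, 11), (2, 22)])
try:
    apply_updates(ages, [(1, 5), (2, 7), (1, 6)])
    aborted = False
except UpdateConflict:
    aborted = True
```

Repeated writes of the same value to the same vertex are allowed here because the final state does not depend on their order.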

    Efficient graph query execution engine supporting graphs with multiple vertex and edge types

    Publication number: US20200265090A1

    Publication date: 2020-08-20

    Application number: US16280591

    Filing date: 2019-02-20

    Abstract: Herein are computerized techniques for processing a heterogeneous graph according to scan-avoidant query planning. In an embodiment, a computer respectively stores a first and second kind of vertices of a property graph into a first and second vertex tables. The computer generates, without scanning the second vertex table: a) an initial partial result of a query of the property graph based on the first vertex table, and b) a subsequent partial result of the query based on the initial partial result and the second kind of vertices. Herein are graph encodings that are dense, without requiring extra computation, and that exploit graph heterogeneity to achieve an aggregation granularity that reduces data working set scope, optimizes for caching, and encourages compression. Herein are query execution mechanisms and techniques that intelligently avoid accessing circumstantially extraneous data and/or structures and that can horizontally scale.
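
The idea of producing a later partial result without scanning the second vertex table can be sketched in Python as follows. The `person` and `company` tables, the `works_at` edge arrays, and the query itself are invented for illustration. The sketch only shows that rows of the second table are reached by index through the edges, never by iterating over the table.

```python
# First vertex table: one column vector per property.
person = {"name": ["Ann", "Bob", "Cy"], "age": [35, 25, 40]}

# Second vertex table, accessed only by row index below.
company = {"name": ["Acme", "Bolt", "Cord"]}

# Hypothetical person -> company edges in CSR form: the edges of person p
# occupy works_at_dst[works_at_begin[p]:works_at_begin[p + 1]].
works_at_begin = [0, 1, 2, 4]
works_at_dst = [0, 1, 0, 2]


def employers_of_people_older_than(min_age):
    """Return (person name, company name) pairs for people older than min_age."""
    results = []
    # Initial partial result: filter the first vertex table only.
    for p, age in enumerate(person["age"]):
        if age <= min_age:
            continue
        # Subsequent partial result: follow edges and look up company rows
        # directly by index instead of scanning the company table.
        for e in range(works_at_begin[p], works_at_begin[p + 1]):
            c = works_at_dst[e]
            results.append((person["name"][p], company["name"][c]))
    return results


matches = employers_of_people_older_than(30)
```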

    Deterministic semantic for graph property update queries and its efficient implementation

    Publication number: US20230095703A1

    Publication date: 2023-03-30

    Application number: US17479006

    Filing date: 2021-09-20

    Abstract: Efficiently implemented herein is a deterministic semantic for property updates by graph queries. Mechanisms of determinism herein ensure data consistency for graph mutation. These mechanisms facilitate optimistic execution of graph access despite a potential data access conflict. This approach may include various combinations of special activities such as detecting potential conflicts during query compile time, applying query transformations to eliminate those conflicts during code generation where possible, and executing updates in an optimistic way that safely fails if determinism cannot be guaranteed. In an embodiment, a computer receives a request to modify a graph. The request to modify the graph is optimistically executed after preparation and according to safety precautions as presented herein. Based on optimistically executing the request, a data access conflict actually occurs and is automatically detected. Based on the data access conflict, optimistically executing the request is prematurely and automatically halted without finishing executing the request.

    Learning property graph representations edge-by-edge

    Publication number: US11205050B2

    Publication date: 2021-12-21

    Application number: US16179049

    Filing date: 2018-11-02

    Abstract: Techniques are described herein for learning property graph representations edge-by-edge. In an embodiment, an input graph is received. The input graph comprises a plurality of vertices and a plurality of edges. Each vertex of the plurality of vertices is associated with vertex properties of the respective vertex. A vertex-to-property mapping is generated for each vertex of the plurality of vertices. The mapping maps each vertex to a vertex-property signature of a plurality of vertex-property signatures. A plurality of edge words is generated. Each edge word corresponds to one or more edges that each begin at a first vertex having a particular vertex-property signature of the plurality of vertex property signatures and end at a second vertex having a particular vertex-property signature of the plurality of vertex property signatures. A plurality of sentences is generated. Each sentence comprises edge words directly connected along a path of a plurality of paths in the input graph. Using the plurality of sentences and the plurality of edge words, a document vectorization model is used to generate machine learning vectors that represent the input graph.
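
The pipeline from vertex-property signatures to edge words to sentences can be sketched in Python as follows. The signature format, the `->` separator, and the assumption that paths arrive as lists of vertex ids are illustrative choices. The final document vectorization step is omitted because the abstract does not specify its details.

```python
def signature(props):
    """Map a vertex's properties to a canonical signature string (assumed format)."""
    return "|".join(f"{key}={props[key]}" for key in sorted(props))


def edge_word(sig, u, v):
    """Name an edge by the signatures of the vertices it starts and ends at."""
    return f"{sig[u]}->{sig[v]}"


def sentences_from_paths(vertex_props, paths):
    """Turn each path (a list of vertex ids) into a sentence of edge words."""
    sig = {v: signature(p) for v, p in vertex_props.items()}  # vertex-to-signature map
    return [
        [edge_word(sig, u, v) for u, v in zip(path, path[1:])]
        for path in paths
    ]


# Tiny example: two vertices share a signature, so edges between equally
# typed endpoints collapse into the same word.
vertex_props = {
    0: {"type": "user"},
    1: {"type": "item"},
    2: {"type": "user"},
}
sentences = sentences_from_paths(vertex_props, [[0, 1, 2], [2, 1]])
```

The resulting token lists could then be passed to a document embedding model of the reader's choice to obtain one vector per graph.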

    Categorical feature encoding for property graphs by vertex proximity

    Publication number: US20200257982A1

    Publication date: 2020-08-13

    Application number: US16270535

    Filing date: 2019-02-07

    Abstract: Techniques are described herein for encoding categorical features of property graphs by vertex proximity. In an embodiment, an input graph is received. The input graph comprises a plurality of vertices, each vertex of said plurality of vertices is associated with vertex properties of said vertex. The vertex properties include at least one categorical feature value of one or more potential categorical feature values. For each of the one or more potential categorical feature values of each vertex, a numerical feature value is generated. The numerical feature value represents a proximity of the respective vertex to other vertices of the plurality of vertices that have a categorical feature value corresponding to the respective potential categorical feature value. Using the numerical feature values for each vertex, proximity encoding data is generated representing said input graph. The proximity encoding data is used to efficiently train machine learning models that produce results with enhanced accuracy.
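
A minimal Python sketch of proximity encoding follows. The abstract does not define the proximity measure, so this sketch assumes one: the inverse of the hop distance to the nearest other vertex carrying the category value, with 0.0 when no such vertex is reachable. The function names and the adjacency-list input are likewise assumptions.

```python
from collections import deque


def hop_distances(adj, start):
    """Breadth-first search: hop count from 'start' to every reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def proximity_encoding(adj, category):
    """Replace each vertex's categorical value with one number per possible value.

    Assumed measure: 1 / (hops to the nearest OTHER vertex with that value),
    or 0.0 if no such vertex is reachable.
    """
    values = sorted(set(category.values()))
    encoding = {}
    for v in adj:
        dist = hop_distances(adj, v)
        row = {}
        for value in values:
            hops = [d for u, d in dist.items() if u != v and category[u] == value]
            row[value] = 1.0 / min(hops) if hops else 0.0
        encoding[v] = row
    return encoding


# Undirected path 0 - 1 - 2 - 3 with categories a, b, b, a.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
category = {0: "a", 1: "b", 2: "b", 3: "a"}
encoding = proximity_encoding(adj, category)
```

Unlike one-hot encoding, each row here also reflects where the vertex sits in the graph relative to every category, which is the property the abstract attributes to proximity encoding.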
