FAST AND MEMORY-EFFICIENT DISTRIBUTED GRAPH MUTATIONS

    公开(公告)号:US20230237047A1

    公开(公告)日:2023-07-27

    申请号:US17585117

    申请日:2022-01-26

    CPC classification number: G06F16/2379 G06F16/9024

    Abstract: Data structures and methods are described for applying mutations on a distributed graph in a fast and memory-efficient manner. Nodes in a distributed graph processing system may store graph information such as vertices, edges, properties, vertex keys, vertex degree counts, and other information in graph arrays, which are divided into shared arrays and delta logs. The shared arrays on a local node remain immutable and are the starting point of a graph, on top of which mutations build new snapshots. Mutations may be supported at both the entity and table levels. Periodic delta log consolidation may occur at multiple levels to prevent excessive delta log buildup. Consolidation at the table level may also trigger rebalancing of vertices across the nodes.

    AUTOMATED DATASET DRIFT DETECTION
    112.
    发明申请

    公开(公告)号:US20230139718A1

    公开(公告)日:2023-05-04

    申请号:US17513760

    申请日:2021-10-28

    Abstract: Herein are acceleration and increased reliability based on classification and scoring techniques for machine learning that compare two similar datasets of different ages to detect data drift without a predefined drift threshold. Various subsets are randomly sampled from the datasets. The subsets are combined in various ways to generate subsets of various age mixtures. In an embodiment, ages are permuted and drift is detected based on whether or not fitness scores indicate that an age binary classifier is confused. In an embodiment, an anomaly detector measures outlier scores of two subsets of different age mixtures. Drift is detected when the outlier scores diverge. In a two-arm bandit embodiment, iterations randomly alternate between both datasets based on respective probabilities that are adjusted by a bandit reward based on outlier scores from an anomaly detector. Drift is detected based on the probability of the younger dataset.

    DUPLICATION ELIMINATION IN DEPTH BASED SEARCHES FOR DISTRIBUTED SYSTEMS

    公开(公告)号:US20220179859A1

    公开(公告)日:2022-06-09

    申请号:US17116831

    申请日:2020-12-09

    Abstract: Systems and methods for improving evaluation of graph queries through depth first traversals are described herein. In an embodiment, a multi-node system evaluates against graph data a graph query that specifies a particular pattern to match by determining, at a first node of the multi-node system, in a particular instance of evaluating the graph query, that one or more first vertices on the first node match a first portion of the graph query and that a second vertex that is to be evaluated next is stored on a second node separate from the first node. In response to determining that the next vertex to be evaluated is stored on the second node separate from the first node, the first node generates a message to the second node comprising one or more results of the first portion of the graph query based on the one or more first vertices, an identifier of the next vertex, and a current stage of evaluating the graph query. In response to generating the message from the first node to the second node, the first node ceases the particular instance of evaluating the graph query.

    Efficient, in-memory, relational representation for heterogeneous graphs

    公开(公告)号:US11120082B2

    公开(公告)日:2021-09-14

    申请号:US15956115

    申请日:2018-04-18

    Abstract: Techniques are provided herein for efficient representation of heterogeneous graphs in memory. In an embodiment, vertices and edges of the graph are segregated by type. Each property of a type of vertex or edge has values stored in a respective vector. Directed or undirected edges of a same type are stored in compressed sparse row (CSR) format. The CSR format is more or less repeated for edge traversal in either forward or reverse direction. An edge map translates edge offsets obtained from traversal in the reverse direction for use with data structures that expect edge offsets in the forward direction. Subsequent filtration and/or traversal by type or property of vertex or edge entails minimal data access and maximal data locality, thereby increasing efficient use of the graph.

Patent Agency Ranking