Distributed graph processing system featuring interactive remote control mechanism including task cancellation

    Publication No.: US10318355B2

    Publication date: 2019-06-11

    Application No.: US15413811

    Filing date: 2017-01-24

    Abstract: Techniques herein provide job control and synchronization of distributed graph-processing jobs. In an embodiment, a computer system maintains an input queue of graph processing jobs. In response to de-queuing a graph processing job, a master thread partitions the graph processing job into distributed jobs. Each distributed job has a sequence of processing phases. The master thread sends each distributed job to a distributed processor. Each distributed job executes a first processing phase of its sequence of processing phases. To the master thread, the distributed job announces completion of its first processing phase. The master thread detects that all distributed jobs have announced finishing their first processing phase. The master thread broadcasts a notification to the distributed jobs that indicates that all distributed jobs have finished their first processing phase. Receiving that notification causes the distributed jobs to execute their second processing phase. Queues and barriers provide for faults and cancellation.
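The phase synchronization described in this abstract can be sketched with threads on a single machine. This is a minimal illustration, not the patented implementation: `run_job`, the `done` queue, and the per-phase `go` events are hypothetical names, worker threads stand in for distributed processors, and fault handling and cancellation are omitted.

```python
import queue
import threading

def run_job(num_workers, phases):
    """Run each phase on every worker; nobody starts phase p+1 until all finish p."""
    done = queue.Queue()                        # workers -> master: "I finished phase p"
    go = [threading.Event() for _ in phases]    # master -> workers: "everyone finished p"
    log, log_lock = [], threading.Lock()

    def worker(wid):
        for p, phase in enumerate(phases):
            result = phase(wid)                 # execute this worker's share of phase p
            with log_lock:
                log.append((p, wid, result))
            done.put((wid, p))                  # announce completion to the master
            go[p].wait()                        # block until the master's broadcast

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()

    # Master loop: collect one announcement per worker, then broadcast.
    for p in range(len(phases)):
        finished = set()
        while len(finished) < num_workers:
            wid, _phase_no = done.get()
            finished.add(wid)
        go[p].set()

    for t in threads:
        t.join()
    return log

if __name__ == "__main__":
    for entry in run_job(3, [lambda w: w * 2, lambda w: w + 100]):
        print(entry)
```

Using one event per phase keeps the broadcast explicit; `threading.Barrier` would give the same ordering without a visible master.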

    Finding common neighbors between two nodes in a graph

    Publication No.: US10157239B2

    Publication date: 2018-12-18

    Application No.: US14139237

    Filing date: 2013-12-23

    Abstract: Techniques for identifying common neighbors of two nodes in a graph are provided. One technique involves performing a binary split search and/or a linear search. Another technique involves creating a segmenting index for a first neighbor list. A second neighbor list is scanned and, for each node indicated in the second neighbor list, the segmenting index is used to determine whether the node is also indicated in the first neighbor list. Techniques are also provided for counting the number of triangles. One technique involves pruning nodes from neighbor lists based on the node values of the nodes whose neighbor lists are being pruned. Another technique involves sorting the nodes in a node array (and, thus, their respective neighbor lists) based on the nodes' respective degrees prior to identifying common neighbors. In this way, when pruning the neighbor lists, the neighbor lists of the highly connected nodes are significantly reduced.
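The linear-search intersection and the degree-ordered pruning mentioned in this abstract can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, the graph is an undirected adjacency dict, and ranking low-degree nodes first (so each node keeps only higher-ranked neighbors) is one plausible reading of the pruning step, not necessarily the patent's.

```python
def common_neighbors(a, b):
    """Intersect two sorted neighbor lists with a linear merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def count_triangles(adj):
    """Count triangles in an undirected graph given as {node: [neighbors]}."""
    # Rank nodes by degree (ties broken by id); low-degree nodes come first.
    order = sorted(adj, key=lambda v: (len(adj[v]), v))
    rank = {v: r for r, v in enumerate(order)}
    # Prune: each node keeps only neighbors of higher rank, so the lists of
    # highly connected nodes (which have high rank) shrink the most.
    pruned = {v: sorted(rank[u] for u in adj[v] if rank[u] > rank[v]) for v in adj}
    total = 0
    for v in adj:
        for ru in pruned[v]:
            u = order[ru]
            # Every common higher-ranked neighbor closes exactly one triangle.
            total += len(common_neighbors(pruned[v], pruned[u]))
    return total

if __name__ == "__main__":
    k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
    print(count_triangles(k4))
```

Because each triangle is only found from its lowest-ranked vertex, no division by three or six is needed at the end.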

    FAST PROCESSING OF PATH-FINDING QUERIES IN LARGE GRAPH DATABASES
    Patent application, status: pending (published)

    Publication No.: US20170060958A1

    Publication date: 2017-03-02

    Application No.: US14837696

    Filing date: 2015-08-27

    Abstract: Techniques herein are for fast processing of path-finding queries in large graph databases. A computer system receives a graph search request to find a set of result paths between one or more source vertices of a graph and one or more target vertices of the graph. The graph comprises vertices connected by edges. During a first pass, the computer system performs one or more breadth-first searches to identify a subset of edges of the graph. The one or more breadth-first searches originate at the one or more source vertices. After the first pass and during a second pass, the computer system performs one or more depth-first searches to identify the set of result paths. The one or more depth-first searches originate at the one or more target vertices. The one or more depth-first searches traverse at most the subset of edges of the graph.
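A two-pass search of this shape might look like the sketch below. The abstract does not say which edges the first pass keeps; here the assumption is that it keeps edges lying on shortest paths from the sources, so the result is the set of shortest paths. `find_paths` and its variable names are hypothetical.

```python
from collections import deque

def find_paths(graph, sources, targets):
    """Return paths from any source to any target, as lists of vertices."""
    # Pass 1: breadth-first search from the sources. An edge u -> v is kept
    # only if it advances exactly one BFS level (assumed subset criterion).
    dist = {s: 0 for s in sources}
    frontier = deque(dist)
    kept = {}                                   # v -> predecessors over kept edges
    while frontier:
        u = frontier.popleft()
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                frontier.append(v)
            if dist[v] == dist[u] + 1:
                kept.setdefault(v, []).append(u)

    # Pass 2: depth-first search backwards from the targets, traversing
    # only the kept edges, until a source (distance 0) is reached.
    paths = []

    def dfs(node, suffix):
        suffix = [node] + suffix
        if dist[node] == 0:
            paths.append(suffix)
            return
        for pred in kept.get(node, ()):
            dfs(pred, suffix)

    for t in targets:
        if t in dist:                           # skip targets pass 1 never reached
            dfs(t, [])
    return paths

if __name__ == "__main__":
    g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    print(find_paths(g, ["a"], ["d"]))
```

The second pass never touches an edge the first pass discarded, which is what bounds the depth-first work.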

    AUTOMATIC GENERATION OF MULTI-SOURCE BREADTH-FIRST SEARCH FROM HIGH-LEVEL GRAPH LANGUAGE
    Patent application, status: pending (published)

    Publication No.: US20160335322A1

    Publication date: 2016-11-17

    Application No.: US14710117

    Filing date: 2015-05-12

    CPC classification number: G06F17/30958

    Abstract: Techniques are described herein for automatic generation of multi-source breadth-first search (MS-BFS) from high-level graph processing language. In an embodiment, a method involves a computer analyzing original software instructions. The original software instructions are configured to perform multiple breadth-first searches to determine a particular result. Each breadth-first search originates at each of a subset of vertices of a graph. Each breadth-first search is encoded for independent execution. Based on the analyzing, the computer generates transformed software instructions configured to perform an MS-BFS to determine the particular result. Each of the subset of vertices is a source of the MS-BFS. In an embodiment, parallel execution of the MS-BFS is regulated with batches of vertices. In an embodiment, the original software instructions are expressed in Green-Marl graph analysis language. In an embodiment, the transformed software instructions are expressed in a general purpose programming language such as C, C++, Python, or Java.
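To show what the generated code is aiming for, here is a hand-written sketch of a multi-source BFS that shares one traversal among several sources by packing per-source state into bitmasks. It does not illustrate the code generation itself or Green-Marl; `ms_bfs` is a hypothetical name, and the sketch assumes every vertex appears as a key of `graph`.

```python
def ms_bfs(graph, sources):
    """Return dist, where dist[i][v] is the hop count from sources[i] to v."""
    seen = {v: 0 for v in graph}          # bit i set: search i has reached v
    frontier = {}                         # vertex -> searches expanding from it
    dist = [dict() for _ in sources]
    for i, s in enumerate(sources):
        seen[s] |= 1 << i
        frontier[s] = frontier.get(s, 0) | (1 << i)
        dist[i][s] = 0

    level = 0
    while frontier:
        level += 1
        nxt = {}
        for u, mask in frontier.items():
            for v in graph[u]:
                new = mask & ~seen[v]     # searches reaching v for the first time
                if new:
                    seen[v] |= new
                    nxt[v] = nxt.get(v, 0) | new
                    for i in range(len(sources)):
                        if (new >> i) & 1:
                            dist[i][v] = level
        frontier = nxt
    return dist

if __name__ == "__main__":
    chain = {0: [1], 1: [2], 2: [3], 3: []}
    print(ms_bfs(chain, [0, 2]))
```

The saving comes from visiting each edge once per level for all searches that share a frontier vertex, rather than once per search.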

    Graph-data partitioning for workload-balanced distributed computation with cost estimation functions
    Granted patent, status: in force

    Publication No.: US09477532B1

    Publication date: 2016-10-25

    Application No.: US14876075

    Filing date: 2015-10-06

    CPC classification number: G06F9/5083 G06F9/4881 G06F2209/5022

    Abstract: Techniques herein perform workload-balanced graph partitioning. Each graph partition is distributed to a respective computer. Each computer applies a workload-estimation function to its partition to calculate a numeric workload-value that indicates how much computation the partition needs. Each computer sends its numeric workload-value to a master computer. The master compares the highest and lowest numeric workload-values. If the difference exceeds a threshold, the master determines how much work the overloaded computers should offload to under-utilized computers. To each overloaded computer, the master sends a directive with a balancing numeric workload-value that indicates how much computation to offload and an identifier of an under-utilized computer to receive the offload. Based on this directive and the workload-estimation function, an overloaded computer selects a portion of its partition that corresponds to the balancing numeric workload-value, removes that portion from its partition, and transfers the portion to the under-utilized computer, which adds the portion to its partition.
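One balancing round of this kind can be sketched in a single process. Several details are assumptions made for illustration: the workload-estimation function is taken to be the sum of vertex degrees, only the single most and least loaded partitions are paired, the balancing value is half their difference, and the overloaded side picks vertices greedily. `estimate` and `rebalance` are hypothetical names.

```python
def estimate(partition, degree):
    """Assumed workload-estimation function: total degree of the partition."""
    return sum(degree[v] for v in partition)

def rebalance(partitions, degree, threshold):
    """Run one balancing round in place; return (donor, receiver, moved_load) or None."""
    # Each computer reports its workload-value to the master.
    loads = {name: estimate(p, degree) for name, p in partitions.items()}
    hi = max(loads, key=loads.get)
    lo = min(loads, key=loads.get)
    if loads[hi] - loads[lo] <= threshold:
        return None                              # already balanced enough

    # Master's directive: how much to offload, and to whom.
    budget = (loads[hi] - loads[lo]) / 2

    # The overloaded computer selects a portion worth at most `budget`.
    moved, moved_load = [], 0
    for v in sorted(partitions[hi], key=lambda v: degree[v], reverse=True):
        if moved_load + degree[v] <= budget:
            moved.append(v)
            moved_load += degree[v]

    # Transfer the portion to the under-utilized computer.
    for v in moved:
        partitions[hi].remove(v)
        partitions[lo].append(v)
    return hi, lo, moved_load

if __name__ == "__main__":
    degree = {1: 5, 2: 4, 3: 3, 4: 1, 5: 1}
    parts = {"A": [1, 2, 3], "B": [4, 5]}
    print(rebalance(parts, degree, threshold=2), parts)
```

Capping the transfer at half the difference keeps a single round from overshooting and swapping which side is overloaded.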

    DISTRIBUTED GRAPH PROCESSING SYSTEM THAT SUPPORT REMOTE DATA READ WITH PROACTIVE BULK DATA TRANSFER
    Patent application, status: pending (published)

    Publication No.: US20160292303A1

    Publication date: 2016-10-06

    Application No.: US14678358

    Filing date: 2015-04-03

    CPC classification number: G06F16/9024 G06F16/254

    Abstract: Techniques for generating and transferring bulk messages from one computing device to another computing device in a cluster are provided. Each computing device in a cluster is assigned a different set of nodes of a graph. A first computing device may be assigned a particular node that is neighbors with multiple other nodes that are assigned to one or more other computing devices in the cluster. When processing graph-related code at the first computing device, information about the neighbors may be required. The first computing device receives a bulk message from one of the other computing devices. The bulk message contains information about at least a subset of the neighbors. Therefore, the first computing device is not required to send multiple messages for information about the subset of neighbors. In fact, the first computing device is not required to send any message for the information.
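The proactive bulk transfer can be sketched as a function that decides, per pair of machines, which node values to ship in one message. This is an illustration under assumptions: `build_bulk_messages` is a hypothetical name, an edge `(u, v)` is taken to mean that processing `u` reads the value of its neighbor `v`, and actual network transport is left out.

```python
def build_bulk_messages(owner, edges, values):
    """Group remotely needed node values into one message per machine pair.

    owner:  {node: machine id}
    edges:  iterable of (u, v), where processing u reads the value of v
    values: {node: value}
    Returns {(sender, receiver): {node: value}}.
    """
    bulk = {}
    for u, v in edges:
        sender, receiver = owner[v], owner[u]
        if sender != receiver:
            # v lives on another machine: its owner ships the value up front,
            # so the reader never has to send a request for it.
            bulk.setdefault((sender, receiver), {})[v] = values[v]
    return bulk

if __name__ == "__main__":
    owner = {"a": 0, "b": 1, "c": 1, "d": 0}
    edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
    values = {"a": 10, "b": 20, "c": 30, "d": 40}
    print(build_bulk_messages(owner, edges, values))
```

In this example machine 1 sends machine 0 a single message carrying both `b` and `c`, instead of answering two separate read requests.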
