-
公开(公告)号:US20210373938A1
公开(公告)日:2021-12-02
申请号:US16883317
申请日:2020-05-26
Applicant: Oracle International Corporation
Inventor: Petr Koupy , Vasileios Trigonakis , Iraklis Psaroudakis , Jinsoo Lee , Sungpack Hong , Hassan Chafi
IPC: G06F9/48 , G06F9/448 , G06F9/38 , G06F11/07 , G06F16/901
Abstract: In an embodiment, a computer of a cluster of computers receives graph logic that specifies a sequence of invocations, including a current invocation and a next invocation, of parallelism operations that can detect whether the graph logic should prematurely terminate. The computer initiates, on the computers of the cluster, execution of the graph logic to process a distributed graph. Before the current invocation, the graph logic registers reversion logic for a modification of the distributed graph that execution of the graph logic has caused. During the current invocation, it is detected that the graph logic should prematurely terminate. Execution of the graph logic on the cluster is terminated without performing the next invocation in the sequence of invocations. The reversion logic reverses the modification of the distributed graph to restore consistency. The distributed graph is retained in volatile memory of the cluster for reuse such as relaunch of the graph logic.
-
42.
公开(公告)号:US20210224235A1
公开(公告)日:2021-07-22
申请号:US16747827
申请日:2020-01-21
Applicant: Oracle International Corporation
Inventor: Marco Arnaboldi , Jean-Pierre Lozi , Laurent Phillipe Daynes , Vlad Ioan Haprian , Shasank Kisan Chavan , Hugo Kapp , Sungpack Hong
IPC: G06F16/21 , G06F16/901 , G06F16/28 , G06F16/27
Abstract: Herein are techniques that concurrently populate entries in a compressed sparse row (CSR) encoding, of a type of edge of a heterogenous graph. In an embodiment, a computer obtains a mapping of a relational schema to a graph data model. The relational schema defines vertex tables that correspond to vertex types in the graph data model, and edge tables that correspond to edge types in the graph data model. Each edge type is associated with a source vertex type and a target vertex type. For each vertex type, a sequence of persistent identifiers of vertices is obtained. Based on the mapping and for a CSR representation of each edge type, a source array is populated that, for a same vertex ordering as the sequence of persistent identifiers for the source vertex type, is based on counts of edges of the edge type that originate from vertices of the source vertex type. For the CSR, the computer populates, in parallel and based on said mapping, a destination array that contains canonical offsets as sequence positions within the sequence of persistent identifiers of the vertices.
-
公开(公告)号:US10902203B2
公开(公告)日:2021-01-26
申请号:US16392386
申请日:2019-04-23
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra , Davide Bartolini , Sungpack Hong , Hassan Chafi , Alberto Parravicini
IPC: G06F40/295 , G06N5/02
Abstract: Techniques are described herein for performing named entity disambiguation. According to an embodiment, a method includes receiving input text, extracting a first mention and a second mention from the input text, and selecting, from a knowledge graph, a plurality of first candidate vertices for the first mention and a plurality of second candidate vertices for the second mention. The present method also includes evaluating a score function that analyzes vertex embedding similarity between the plurality of first candidate vertices and the plurality of second candidate vertices. In response to evaluating and seeking to optimize the score function, the method performs selecting a first selected candidate vertex from the plurality of first candidate vertices and a second selected candidate vertex from the plurality of second candidate vertices. Further, the present method includes mapping a first entry from the knowledge graph to the first mention and mapping a second entry from the knowledge graph to the second mention. In this embodiment, the first entry corresponds to the first selected candidate vertex and the second entry corresponds to the second selected candidate.
-
公开(公告)号:US10831829B2
公开(公告)日:2020-11-10
申请号:US16185236
申请日:2018-11-09
Applicant: Oracle International Corporation
Inventor: Martin Sevenich , Sungpack Hong , Alexander Weld , Hassan Chafi
IPC: G06F16/00 , G06F16/901 , G06F16/9532 , G06F16/22
Abstract: Embodiments perform real-time vertex connectivity checks in graph data representations via a multi-phase search process. This process includes an efficient first search phase using landmark connectivity data that is generated during a preprocessing phase. Landmark connectivity data maps the connectivity of a set of identified landmarks in a graph to other vertices in the graph. Upon determining that the subject vertices are not closely related via landmarks, embodiments implement a second search phase that performs a brute-force search for connectivity, between the subject vertices, among the graph's non-landmark vertices. This brute-force search prevents exploration of cyclical paths by recording the vertices on a currently-explored path in a stack data structure. The second search phase is automatically aborted upon detecting that the non-landmark vertices in the graph are over a threshold density. In this case, embodiments perform a third search phase involving either a modified breadth-first search or modified bidirectional search.
-
公开(公告)号:US10795672B2
公开(公告)日:2020-10-06
申请号:US16176853
申请日:2018-10-31
Applicant: Oracle International Corporation
Inventor: Martijn Dwars , Martin Sevenich , Sungpack Hong , Hassan Chafi
IPC: G06F8/75 , G06F8/51 , G06F8/41 , G06F8/30 , G06F16/901
Abstract: Techniques are described herein for automatic generation of multi-source breadth-first search (MS-BFS) from high-level graph processing language that can be executed in a distributed computing environment. In an embodiment, a method involves a computer analyzing original software instructions. The original software instructions are configured to perform multiple breadth-first searches to determine a particular result. Each breadth-first search originates at each of a subset of vertices of a graph. Each breadth-first search is encoded for independent execution. Based on the analyzing, the computer generates transformed software instructions configured to perform a MS-BFS to determine the particular result. Each of the subset of vertices is a source of the MS-BFS. In an embodiment, the second plurality of software instructions comprises a node iteration loop and a neighbor iteration loop, and the plurality of vertices of the distributed graph comprise active vertices and neighbor vertices. The node iteration loop is configured to iterate once per each active vertex of the plurality of vertices of the distributed graph, and the node iteration loop is configured to determine the particular result. The neighbor iteration loop is configured to iterate once per each active vertex of the plurality of vertices of the distributed graph, and each iteration of the neighbor iteration loop is configured to activate one or more neighbor vertices of the plurality of vertices for the following iteration of the neighbor iteration loop.
-
46.
公开(公告)号:US20200073868A1
公开(公告)日:2020-03-05
申请号:US16378424
申请日:2019-04-08
Applicant: Oracle International Corporation
Inventor: Arnaud Delamare , Vasileios Trigonakis , Vlad Ioan Haprian , Oskar Van Rest , Sungpack Hong , Hassan Chafi , Tomas Faltin , Jean-Pierre Lozi
IPC: G06F16/2453 , G06F16/901 , G06F16/22 , H03M7/30
Abstract: Techniques are described herein for space-efficient encoding of label information of property graphs. In an embodiment, an input graph is received. The input graph comprises a plurality of entities and a plurality of label sets. Each entity of said plurality of entities is associated with a label set of the plurality of label sets and each label set of the plurality of label sets comprises zero or more labels of a plurality of labels. A first mapping is generated that maps each label of the plurality of labels to a label code. A second mapping is generated that maps each label integer set of a plurality of label integer sets to a label code. Each label integer set of the plurality of label integer sets corresponds to a label set of the plurality of label sets, wherein each label integer set of the plurality of label integer sets comprises label codes from the first mapping that are mapped to each label included in the corresponding label set. A compressed label set is generated for each entity of the plurality of entities. Each compressed label set comprises a plurality of bits that indicate a zeroth state, a first state, a second state, or a third state. The compressed label sets and the first and second mappings are used to efficiently evaluate graph label queries.
-
47.
公开(公告)号:US20190205178A1
公开(公告)日:2019-07-04
申请号:US16353050
申请日:2019-03-14
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Jinsu Lee , Sungpack Hong , Siegfried Depner , Nicholas Roth , Thomas Manhardt , Hassan Chafi
CPC classification number: G06F9/5066 , G06F9/546
Abstract: Techniques herein provide job control and synchronization of distributed graph-processing jobs. In an embodiment, a computer system maintains an input queue of graph processing jobs. In response to de-queuing a graph processing job, a master thread partitions the graph processing job into distributed jobs. Each distributed job has a sequence of processing phases. The master thread sends each distributed job to a distributed processor. Each distributed job executes a first processing phase of its sequence of processing phases. To the master thread, the distributed job announces completion of its first processing phase. The master thread detects that all distributed jobs have announced finishing their first processing phase. The master thread broadcasts a notification to the distributed jobs that indicates that all distributed jobs have finished their first processing phase. Receiving that notification causes the distributed jobs to execute their second processing phase. Queues and barriers provide for faults and cancellation.
-
公开(公告)号:US20190171490A1
公开(公告)日:2019-06-06
申请号:US16270135
申请日:2019-02-07
Applicant: Oracle International Corporation
Inventor: Thomas Manhardt , Sungpack Hong , Siegfried Depner , Jinsu Lee , Nicholas Roth , Hassan Chafi
Abstract: Techniques are provided for dynamically self-balancing communication and computation. In an embodiment, each partition of application data is stored on a respective computer of a cluster. The application is divided into distributed jobs, each of which corresponds to a partition. Each distributed job is hosted on the computer that hosts the corresponding data partition. Each computer divides its distributed job into computation tasks. Each computer has a pool of threads that execute the computation tasks. During execution, one computer receives a data access request from another computer. The data access request is executed by a thread of the pool. Threads of the pool are bimodal and may be repurposed between communication and computation, depending on workload. Each computer individually detects completion of its computation tasks. Each computer informs a central computer that its distributed job has finished. The central computer detects when all distributed jobs of the application have terminated.
-
49.
公开(公告)号:US20180349002A1
公开(公告)日:2018-12-06
申请号:US15609629
申请日:2017-05-31
Applicant: Oracle International Corporation
Inventor: Julia Kindelsberger , Daniel Langerenken , Korbinian Schmid , Sungpack Hong , Hassan Chafi
IPC: G06F3/0484 , G06T11/20 , G06F17/22 , G06F3/0482
Abstract: Techniques herein organize and display as branches the historical versions of filtrations of a property graph in a way that suits interactive exploration. In embodiments, a computer loads metadata that describes versions of filtration of a graph that contains vertices interconnected by edges. Based on the metadata, the computer displays, along a timeline, version indicators that each represents a respective historical version of filtration of the graph. The computer displays, responsive to receiving an interactive selection of a particular version indicator of the plurality of version indicators, a particular version of filtration of the graph that is represented by the particular version indicator. Subsequences of versions for the timeline may be organized as branches that may be interactively created and merged. Branching and merging are integrated into the general lifecycle of graph filtration. A version timeline may be presented and operated as a tool for historical navigation and speculative exploration.
-
50.
公开(公告)号:US10133827B2
公开(公告)日:2018-11-20
申请号:US14710117
申请日:2015-05-12
Applicant: Oracle International Corporation
Inventor: Manuel Then , Sungpack Hong , Martin Sevenich , Hassan Chafi
IPC: G06F17/30
Abstract: Techniques are described herein for automatic generation of multi-source breadth-first search (MS-BFS) from high-level graph processing language. In an embodiment, a method involves a computer analyzing original software instructions. The original software instructions are configured to perform multiple breadth-first searches to determine a particular result. Each breadth-first search originates at each of a subset of vertices of a graph. Each breadth-first search is encoded for independent execution. Based on the analyzing, the computer generates transformed software instructions configured to perform a MS-BFS to determine the particular result. Each of the subset of vertices is a source of the MS-BFS. In an embodiment, parallel execution of the MS-BFS is regulated with batches of vertices. In an embodiment, the original software instructions are expressed in Green-Marl graph analysis language. In an embodiment, the transformed software instructions are expressed in a general purpose programming language such as C, C++, Python, or Java.
-
-
-
-
-
-
-
-
-