-
121.
Publication No.: US20230281219A1
Publication date: 2023-09-07
Application No.: US17686938
Application date: 2022-03-04
Applicant: Oracle International Corporation
Inventor: Jinsu Lee, Petr Koupy, Vasileios Trigonakis, Sungpack Hong, Hassan Chafi
CPC classification number: G06F16/27, G06F16/284, G06F16/2282
Abstract: In an embodiment, multiple computers cooperate to retrieve content from tables in a relational database. Each table contains respective rows. Each row contains a vertex of a graph. Many high-degree vertices are identified. Each high-degree vertex is connected to respective edges in the graph. A count of the edges of each high-degree vertex exceeds a degree threshold. A central computer detects that all vertices in a high-degree subset of tables are high-degree vertices. Based on detecting the high-degree subset of tables, multiple vertices of the graph that are not in the high-degree subset of tables are replicated. Within local storage capacity limits of the computers, this degree-based replication may be supplemented with other vertex replication strategies that are schema based, content based, or workload based. This intelligent selective replication maximizes system throughput by minimizing graph data access latency based on data locality.
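The degree test in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the input layout (`vertex_table` mapping each vertex to its table name, `edges` as pairs) and the function name are assumptions made for illustration, and the replication step itself is not modeled.

```python
from collections import Counter


def high_degree_tables(vertex_table, edges, degree_threshold):
    """Find high-degree vertices and the tables made up only of such vertices.

    vertex_table: dict mapping vertex -> table name (hypothetical layout).
    edges: iterable of (source, destination) pairs.
    """
    # Count every edge endpoint, so a vertex's degree is in-degree + out-degree.
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1

    # A vertex is high-degree when its edge count exceeds the threshold.
    high = {v for v in vertex_table if degree[v] > degree_threshold}

    # Group vertices by table, then keep tables whose vertices are all high-degree.
    by_table = {}
    for vertex, table in vertex_table.items():
        by_table.setdefault(table, []).append(vertex)
    tables = {t for t, vs in by_table.items() if all(v in high for v in vs)}
    return high, tables
```

For example, with one hub vertex stored in its own table and three leaf vertices in another, a threshold of 2 marks only the hub's table as part of the high-degree subset.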
-
Publication No.: US20230252077A1
Publication date: 2023-08-10
Application No.: US17584262
Application date: 2022-01-25
Applicant: Oracle International Corporation
Inventor: Hugo Kapp, Laurent Daynes, Vlad Ioan Haprian, Jean-Pierre Lozi, Zhen Hua Liu, Marco Arnaboldi, Sabina Petride, Andrew Witkowski, Hassan Chafi, Sungpack Hong
IPC: G06F16/901, G06F16/903
CPC classification number: G06F16/9024, G06F16/90335
Abstract: Techniques described herein allow a user of an RDBMS to specify a graph algorithm function (GAF) declaration, which defines a graph algorithm that takes a graph object as input and returns a logical graph object as output. A database dictionary stores the GAF declaration, which allows addition of GAFs without changing the RDBMS kernel. GAFs are used within graph queries to compute output properties of property graph objects. Output properties are accessible in the enclosing graph pattern matching query, and are live for the duration of the query cursor execution. According to various embodiments, the declaration of a GAF includes a DESCRIBE function, used for semantic analysis of the GAF, and an EXECUTE function, which defines the operations performed by the GAF. Furthermore, composition of GAFs in a graph query is done by supplying, as the input graph argument of an outer GAF, the result of an inner GAF.
-
Publication No.: US20230237047A1
Publication date: 2023-07-27
Application No.: US17585117
Application date: 2022-01-26
Applicant: Oracle International Corporation
Inventor: Vasileios Trigonakis, Paul Renauld, Jinsu Lee, Petr Koupy, Sungpack Hong, Hassan Chafi
IPC: G06F16/23, G06F16/901
CPC classification number: G06F16/2379, G06F16/9024
Abstract: Data structures and methods are described for applying mutations on a distributed graph in a fast and memory-efficient manner. Nodes in a distributed graph processing system may store graph information such as vertices, edges, properties, vertex keys, vertex degree counts, and other information in graph arrays, which are divided into shared arrays and delta logs. The shared arrays on a local node remain immutable and are the starting point of a graph, on top of which mutations build new snapshots. Mutations may be supported at both the entity and table levels. Periodic delta log consolidation may occur at multiple levels to prevent excessive delta log buildup. Consolidation at the table level may also trigger rebalancing of vertices across the nodes.
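The split between an immutable shared array and a delta log can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a single in-memory array on one node, a dictionary standing in for the delta log, and a hypothetical class name `DeltaArray`. Distribution, table-level mutations, and rebalancing are not modeled.

```python
class DeltaArray:
    """An immutable base array plus a log of per-index overrides."""

    def __init__(self, shared, delta=None):
        self._shared = tuple(shared)      # never modified after construction
        self._delta = dict(delta or {})   # index -> overriding value

    def get(self, index):
        # Reads consult the delta log first and fall back to the shared array.
        return self._delta.get(index, self._shared[index])

    def mutate(self, index, value):
        """Return a new snapshot; the current snapshot stays unchanged."""
        if not 0 <= index < len(self._shared):
            raise IndexError(index)
        return DeltaArray(self._shared, {**self._delta, index: value})

    def consolidate(self):
        """Fold the delta log into a fresh shared array with an empty log."""
        merged = list(self._shared)
        for index, value in self._delta.items():
            merged[index] = value
        return DeltaArray(merged)
```

Each `mutate` call yields a new snapshot that shares the base array with its predecessor, and `consolidate` bounds the growth of the log at the cost of copying the array once.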
-
Publication No.: US20230139718A1
Publication date: 2023-05-04
Application No.: US17513760
Application date: 2021-10-28
Applicant: Oracle International Corporation
Inventor: Mojtaba Valipour, Yasha Pushak, Robert Harlow, Hesam Fathi Moghadam, Sungpack Hong, Hassan Chafi
Abstract: Herein are acceleration and increased reliability based on classification and scoring techniques for machine learning that compare two similar datasets of different ages to detect data drift without a predefined drift threshold. Various subsets are randomly sampled from the datasets. The subsets are combined in various ways to generate subsets of various age mixtures. In an embodiment, ages are permuted and drift is detected based on whether or not fitness scores indicate that an age binary classifier is confused. In an embodiment, an anomaly detector measures outlier scores of two subsets of different age mixtures. Drift is detected when the outlier scores diverge. In a two-arm bandit embodiment, iterations randomly alternate between both datasets based on respective probabilities that are adjusted by a bandit reward based on outlier scores from an anomaly detector. Drift is detected based on the probability of the younger dataset.
-
Publication No.: US20230121198A1
Publication date: 2023-04-20
Application No.: US18084406
Application date: 2022-12-19
Applicant: Oracle International Corporation
Inventor: Petr Koupy, Thomas Manhardt, Siegfried Depner, Sungpack Hong, Hassan Chafi
Abstract: Techniques herein minimally communicate between computers to repartition a graph. In embodiments, each computer receives a partition of edges and vertices of the graph. For each of its edges or vertices, each computer stores an intermediate representation into an edge table (ET) or vertex table. Different edges of a vertex may be loaded by different computers, which may cause a conflict. Each computer announces that a vertex resides on the computer to a respective tracking computer. Each tracking computer makes assignments of vertices to computers and publicizes those assignments. Each computer that loaded conflicted vertices transfers those vertices to computers of the respective assignments. Each computer stores a materialized representation of a partition based on: the ET and vertex table of the computer, and the vertices and edges that were transferred to the computer. Edges stored in the materialized representation are stored differently than edges stored in the ET.
-
Publication No.: US11630864B2
Publication date: 2023-04-18
Application No.: US16803832
Application date: 2020-02-27
Applicant: Oracle International Corporation
Inventor: Benjamin Schlegel, Martin Sevenich, Pit Fender, Matthias Brantner, Hassan Chafi
IPC: G06F16/901, G06F9/38, G06F9/54
Abstract: Techniques are described for a vectorized queue, which implements a vectorized ‘contains’ function that determines whether a value is in the queue. A three-phase vectorized shortest-path graph search splits each expanding and probing iteration into three phases that utilize vectorized instructions: (1) The neighbors of nodes that are in a next queue are fetched and written into a current queue. (2) It is determined whether the destination node is among the fetched neighbor nodes in the current queue. (3) The fetched neighbor nodes that have not yet been visited are put into the next queue. According to an embodiment, a vectorized copy operation performs vector-based data copying using vectorized load and store instructions. Specifically, vectors of data are copied from a source to a destination. Any invalid data copied to the destination is overwritten, either with a vector of additional valid data or with a vector of nonce data.
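The three phases of each search iteration can be shown with a scalar Python sketch. It deliberately uses ordinary lists and a set in place of the vectorized queue and SIMD instructions the abstract describes, so it illustrates only the control flow. The adjacency-dictionary input and the function name are assumptions.

```python
def three_phase_hops(adj, source, dest):
    """Return the hop count of a shortest path from source to dest, or None.

    adj: dict mapping node -> iterable of neighbor nodes (hypothetical layout).
    """
    if source == dest:
        return 0
    visited = {source}
    next_queue = [source]
    hops = 0
    while next_queue:
        hops += 1
        # Phase 1: fetch the neighbors of every node in the next queue
        # and write them into the current queue.
        current_queue = [n for node in next_queue for n in adj.get(node, ())]
        # Phase 2: 'contains' check for the destination among fetched neighbors.
        if dest in current_queue:
            return hops
        # Phase 3: neighbors not yet visited go into the next queue.
        next_queue = []
        for n in current_queue:
            if n not in visited:
                visited.add(n)
                next_queue.append(n)
    return None
```

Separating the phases means each one is a simple bulk pass over a contiguous queue, which is the property that makes the loops amenable to vector instructions in a lower-level implementation.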
-
Publication No.: US11573793B2
Publication date: 2023-02-07
Application No.: US16822009
Application date: 2020-03-18
Applicant: Oracle International Corporation
Inventor: Harshad Kasture, Matthias Brantner, Hassan Chafi, Benjamin Schlegel, Pit Fender
Abstract: Techniques are provided for lazy push optimization, allowing for constant time push operations. A d-heap is used as the underlying data structure for indexing values being inserted. The d-heap is vectorized by storing values in a contiguous memory array. Heapify operations are delayed until a retrieve operation occurs, improving insert performance of vectorized d-heaps that use horizontal aggregation SIMD instructions at the cost of slightly lower retrieve performance.
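The idea of deferring heapify work until a retrieve can be sketched in Python. Note the substitutions: this sketch uses the standard library's binary heap (`heapq`) rather than a d-heap, and no SIMD instructions. The class name `LazyPushHeap` and its methods are hypothetical.

```python
import heapq


class LazyPushHeap:
    """Min-heap whose pushes only append; ordering is restored on retrieve."""

    def __init__(self):
        self._values = []    # contiguous storage for all values
        self._dirty = False  # True when values were appended since last heapify

    def push(self, value):
        # Amortized constant time: append without sifting the value up.
        self._values.append(value)
        self._dirty = True

    def pop_min(self):
        # Restore the heap property only when a retrieve needs it.
        if self._dirty:
            heapq.heapify(self._values)
            self._dirty = False
        return heapq.heappop(self._values)

    def __len__(self):
        return len(self._values)
```

A run of pushes followed by a single retrieve pays for one linear-time heapify instead of one sift-up per push, which matches the trade-off the abstract states: faster inserts at the cost of somewhat slower retrieves.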
-
Publication No.: US11561780B2
Publication date: 2023-01-24
Application No.: US17069104
Application date: 2020-10-13
Applicant: Oracle International Corporation
Inventor: Petr Koupy, Thomas Manhardt, Siegfried Depner, Sungpack Hong, Hassan Chafi
IPC: G06F16/00, G06F8/41, G06F9/50, G06F17/16, G06F16/27, G06F17/10, G06F11/34, G06F3/06, G06F11/10, G06F16/901
Abstract: Techniques herein minimally communicate between computers to repartition a graph. In embodiments, each computer receives a partition of edges and vertices of the graph. For each of its edges or vertices, each computer stores an intermediate representation into an edge table (ET) or vertex table. Different edges of a vertex may be loaded by different computers, which may cause a conflict. Each computer announces that a vertex resides on the computer to a respective tracking computer. Each tracking computer makes assignments of vertices to computers and publicizes those assignments. Each computer that loaded conflicted vertices transfers those vertices to computers of the respective assignments. Each computer stores a materialized representation of a partition based on: the ET and vertex table of the computer, and the vertices and edges that were transferred to the computer. Edges stored in the materialized representation are stored differently than edges stored in the ET.
-
Publication No.: US11385889B2
Publication date: 2022-07-12
Application No.: US16703499
Application date: 2019-12-04
Applicant: Oracle International Corporation
Inventor: Pit Fender, Benjamin Schlegel, Matthias Brantner, Harshad Kasture, Hassan Chafi
IPC: G06F8/71
Abstract: Herein are machine learning (ML) feature processing and analytic techniques to detect anomalies in parse trees of logic statements, database queries, logic scripts, compilation units of general-purpose programming language, extensible markup language (XML), JAVASCRIPT object notation (JSON), and document object models (DOM). In an embodiment, a computer identifies an operational trace that contains multiple parse trees. Values of explicit features are generated from a single respective parse tree of the multiple parse trees of the operational trace. Values of implicit features are generated from more than one respective parse tree of the multiple parse trees of the operational trace. The explicit and implicit features are stored into a same feature vector. With the feature vector as input, an ML model detects whether or not the operational trace is anomalous, based on the explicit features of each parse tree of the operational trace and the implicit features of multiple parse trees of the operational trace.
-
Publication No.: US20220215055A1
Publication date: 2022-07-07
Application No.: US17141018
Application date: 2021-01-04
Applicant: Oracle International Corporation
Inventor: Arnaud Delamare, Vasileios Trigonakis, Yahya Ez-Zainabi, Sungpack Hong, Hassan Chafi
IPC: G06F16/901
Abstract: Techniques are provided for finding unused vertex and edge identifiers (IDs) in a distributed graph engine. A run-time data structure may be built during the loading of the graph. The data structure identifies unavailable IDs that are associated with graph entities of the graph. The data structure is traversed to determine one or more ranges of free IDs. Unused IDs are generated from the ranges.
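Deriving free ID ranges from the set of unavailable IDs can be sketched as follows. This is a single-machine illustration, not the distributed data structure of the patent. It assumes non-negative integer IDs bounded by a hypothetical `max_id`, and both function names are invented for the example.

```python
def free_id_ranges(used_ids, max_id):
    """Return inclusive (start, end) ranges of IDs in [0, max_id] not in use.

    Assumes IDs are non-negative integers.
    """
    ranges = []
    next_free = 0
    for used in sorted(set(used_ids)):
        if used > max_id:
            break
        # A gap before this used ID is a free range.
        if used > next_free:
            ranges.append((next_free, used - 1))
        next_free = max(next_free, used + 1)
    # Everything after the last used ID is free as well.
    if next_free <= max_id:
        ranges.append((next_free, max_id))
    return ranges


def generate_unused_ids(ranges, count):
    """Draw up to `count` unused IDs from the free ranges, lowest first."""
    ids = []
    for start, end in ranges:
        for candidate in range(start, end + 1):
            if len(ids) == count:
                return ids
            ids.append(candidate)
    return ids
```

Storing ranges rather than individual free IDs keeps the structure small when the used IDs are dense, since each gap costs one pair regardless of its width.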