-
Publication No.: US10902203B2
Publication Date: 2021-01-26
Application No.: US16392386
Filing Date: 2019-04-23
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra , Davide Bartolini , Sungpack Hong , Hassan Chafi , Alberto Parravicini
IPC: G06F40/295 , G06N5/02
Abstract: Techniques are described herein for performing named entity disambiguation. According to an embodiment, a method includes receiving input text, extracting a first mention and a second mention from the input text, and selecting, from a knowledge graph, a plurality of first candidate vertices for the first mention and a plurality of second candidate vertices for the second mention. The present method also includes evaluating a score function that analyzes vertex embedding similarity between the plurality of first candidate vertices and the plurality of second candidate vertices. In response to evaluating and seeking to optimize the score function, the method performs selecting a first selected candidate vertex from the plurality of first candidate vertices and a second selected candidate vertex from the plurality of second candidate vertices. Further, the present method includes mapping a first entry from the knowledge graph to the first mention and mapping a second entry from the knowledge graph to the second mention. In this embodiment, the first entry corresponds to the first selected candidate vertex and the second entry corresponds to the second selected candidate.
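The joint selection described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes cosine similarity as the embedding similarity and exhaustive search as the optimizer, neither of which the abstract specifies. The vertex names and embedding values are hypothetical.

```python
import itertools
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def disambiguate(candidates_per_mention, embeddings):
    """Pick one candidate vertex per mention so that the sum of pairwise
    embedding similarities between the picked vertices is maximal."""
    best_score, best_choice = -math.inf, None
    for choice in itertools.product(*candidates_per_mention):
        score = sum(
            cosine(embeddings[a], embeddings[b])
            for a, b in itertools.combinations(choice, 2)
        )
        if score > best_score:
            best_score, best_choice = score, choice
    return best_choice, best_score

# Hypothetical vertex embeddings for two ambiguous mentions.
EMBEDDINGS = {
    "Paris_(city)": [0.9, 0.1, 0.0],
    "Paris_(mythology)": [0.0, 0.2, 0.9],
    "France_(country)": [0.8, 0.3, 0.1],
    "France_(surname)": [0.1, 0.9, 0.2],
}
CANDIDATES = [
    ["Paris_(city)", "Paris_(mythology)"],    # candidates for mention "Paris"
    ["France_(country)", "France_(surname)"],  # candidates for mention "France"
]

selected, score = disambiguate(CANDIDATES, EMBEDDINGS)
```

Exhaustive search grows exponentially with the number of mentions, so a practical system would presumably use a heuristic optimizer; the sketch only shows what the score function rewards.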
-
Publication No.: US11120082B2
Publication Date: 2021-09-14
Application No.: US15956115
Filing Date: 2018-04-18
Applicant: Oracle International Corporation
Inventor: Damien Hilloulin , Davide Bartolini , Oskar Van Rest , Alexander Weld , Sungpack Hong , Hassan Chafi
IPC: G06F16/901 , G06F16/28 , G06F16/22
Abstract: Techniques are provided herein for efficient representation of heterogeneous graphs in memory. In an embodiment, vertices and edges of the graph are segregated by type. Each property of a type of vertex or edge has values stored in a respective vector. Directed or undirected edges of a same type are stored in compressed sparse row (CSR) format. The CSR format is more or less repeated for edge traversal in either forward or reverse direction. An edge map translates edge offsets obtained from traversal in the reverse direction for use with data structures that expect edge offsets in the forward direction. Subsequent filtration and/or traversal by type or property of vertex or edge entails minimal data access and maximal data locality, thereby increasing efficient use of the graph.
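The layout in this abstract (a property vector, CSR in both directions, and an edge map from reverse offsets to forward offsets) can be sketched for a single edge type as follows. This is an illustrative sketch only; the function names, the dictionary layout, and the `weight` property are assumptions, not taken from the patent.

```python
def build_bidirectional_csr(num_vertices, edges):
    """Build forward CSR, reverse CSR and an edge map for one edge type.

    edges: list of (src, dst, weight) tuples with integer vertex ids.
    """
    fwd = sorted(edges)  # forward order: by source, then destination
    fwd_begin = [0] * (num_vertices + 1)
    for src, _dst, _w in fwd:
        fwd_begin[src + 1] += 1
    for v in range(num_vertices):
        fwd_begin[v + 1] += fwd_begin[v]  # prefix sum gives row starts

    # Reverse CSR: the same edges grouped by destination. edge_map[r] is the
    # forward offset of the edge stored at reverse offset r.
    edge_map = sorted(range(len(fwd)), key=lambda e: (fwd[e][1], fwd[e][0]))
    rev_begin = [0] * (num_vertices + 1)
    for e in edge_map:
        rev_begin[fwd[e][1] + 1] += 1
    for v in range(num_vertices):
        rev_begin[v + 1] += rev_begin[v]

    return {
        "fwd_begin": fwd_begin,
        "fwd_dst": [dst for _src, dst, _w in fwd],
        "rev_begin": rev_begin,
        "rev_src": [fwd[e][0] for e in edge_map],
        "edge_map": edge_map,
        # Property vector, indexed by forward edge offset only.
        "weight": [w for _src, _dst, w in fwd],
    }

def in_edges(csr, v):
    """Incoming (source, weight) pairs of v, found via the reverse CSR.
    The edge map translates each reverse offset to a forward offset so the
    forward-indexed property vector can be reused."""
    return [
        (csr["rev_src"][r], csr["weight"][csr["edge_map"][r]])
        for r in range(csr["rev_begin"][v], csr["rev_begin"][v + 1])
    ]

csr = build_bidirectional_csr(3, [(2, 0, 40), (0, 2, 20), (1, 2, 30), (0, 1, 10)])
incoming_of_2 = in_edges(csr, 2)
```

Keeping properties in forward order only means each property value is stored once; the reverse direction pays one extra indirection through the edge map.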
-
Publication No.: US20200342055A1
Publication Date: 2020-10-29
Application No.: US16392386
Filing Date: 2019-04-23
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra , Davide Bartolini , Sungpack Hong , Hassan Chafi , Alberto Parravicini
Abstract: Techniques are described herein for performing named entity disambiguation. According to an embodiment, a method includes receiving input text, extracting a first mention and a second mention from the input text, and selecting, from a knowledge graph, a plurality of first candidate vertices for the first mention and a plurality of second candidate vertices for the second mention. The present method also includes evaluating a score function that analyzes vertex embedding similarity between the plurality of first candidate vertices and the plurality of second candidate vertices. In response to evaluating and seeking to optimize the score function, the method performs selecting a first selected candidate vertex from the plurality of first candidate vertices and a second selected candidate vertex from the plurality of second candidate vertices. Further, the present method includes mapping a first entry from the knowledge graph to the first mention and mapping a second entry from the knowledge graph to the second mention. In this embodiment, the first entry corresponds to the first selected candidate vertex and the second entry corresponds to the second selected candidate.
-
Publication No.: US11526673B2
Publication Date: 2022-12-13
Application No.: US17153078
Filing Date: 2021-01-20
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra , Davide Bartolini , Sungpack Hong , Hassan Chafi , Alberto Parravicini
IPC: G06F40/295 , G06N5/02
Abstract: According to an embodiment, a method includes converting a knowledge base into a graph. In this embodiment, the knowledge base contains a plurality of entities and specifies a plurality of relationships among the plurality of entities, and entities in the knowledge base correspond to vertices in the graph, and relationships between entities in the knowledge base correspond to edges between vertices in the graph. The method may also include extracting a plurality of vertex embeddings from the graph. An example vertex embedding of the plurality of vertex embeddings represents, for a particular vertex, a proximity of the particular vertex to other vertices of the graph. Further, the method may include performing, based at least in part on the plurality of vertex embeddings, entity linking between input text and the knowledge base.
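The first two steps of this abstract (knowledge base to graph, then one proximity-based embedding per vertex) can be sketched as follows. The abstract does not say how embeddings are extracted; this sketch assumes a hand-built embedding whose components are 1 / (1 + hop distance) to every vertex, standing in for a learned embedding. The triples and function names are hypothetical.

```python
from collections import deque

def kb_to_graph(triples):
    """Entities become vertices; each (subject, relation, object) triple
    becomes an undirected edge between its two entities."""
    graph = {}
    for subj, _relation, obj in triples:
        graph.setdefault(subj, set()).add(obj)
        graph.setdefault(obj, set()).add(subj)
    return graph

def proximity_embedding(graph, vertex, order):
    """Embed `vertex` as its closeness to every vertex in `order`:
    1 / (1 + hop distance), or 0.0 if unreachable."""
    dist = {vertex: 0}
    queue = deque([vertex])
    while queue:  # breadth-first search yields shortest hop distances
        cur = queue.popleft()
        for nxt in graph[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return [1.0 / (1 + dist[u]) if u in dist else 0.0 for u in order]

TRIPLES = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "member_of", "EU"),
]
graph = kb_to_graph(TRIPLES)
order = sorted(graph)  # fixed vertex order defines the embedding dimensions
paris_embedding = proximity_embedding(graph, "Paris", order)
```

An embedding with one dimension per vertex does not scale to a large knowledge base; it is used here only because it makes the "proximity to other vertices" idea directly visible.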
-
Publication No.: US20200265090A1
Publication Date: 2020-08-20
Application No.: US16280591
Filing Date: 2019-02-20
Applicant: Oracle International Corporation
Inventor: Damien Hilloulin , Davide Bartolini , Oskar Van Rest , Vlad Haprian , Sungpack Hong , Hassan Chafi
IPC: G06F16/901 , G06F16/903
Abstract: Herein are computerized techniques for processing a heterogeneous graph according to scan-avoidant query planning. In an embodiment, a computer respectively stores a first and second kind of vertices of a property graph into a first and second vertex tables. The computer generates, without scanning the second vertex table: a) an initial partial result of a query of the property graph based on the first vertex table, and b) a subsequent partial result of the query based on the initial partial result and the second kind of vertices. Herein are graph encodings that are dense, without requiring extra computation, and that exploit graph heterogeneity to achieve an aggregation granularity that reduces data working set scope, optimizes for caching, and encourages compression. Herein are query execution mechanisms and techniques that intelligently avoid accessing circumstantially extraneous data and/or structures and that can horizontally scale.
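The scan-avoidant idea in this abstract can be sketched as follows: an initial partial result is computed from the first vertex table, and the second table is then reached only through edge offsets, never scanned. The table contents, the `works_at` edge type, and the `touched` bookkeeping are hypothetical and exist only to make the access pattern observable.

```python
# Two vertex kinds stored in separate columnar tables (hypothetical data).
PERSONS = {"name": ["Ana", "Bo", "Cy"], "age": [31, 45, 28]}
COMPANIES = {"name": ["Acme", "Globex"], "city": ["Zurich", "Austin"]}

# works_at edges in CSR form: row = person row id, value = company row id.
WORKS_AT_BEGIN = [0, 1, 2, 3]
WORKS_AT_DST = [1, 0, 1]

def employers_of_people_older_than(min_age):
    """Return (person, company) name pairs plus the set of company rows read."""
    touched = set()
    # Initial partial result: derived from the person table only.
    matched = [row for row, age in enumerate(PERSONS["age"]) if age > min_age]
    results = []
    # Subsequent partial result: follow edges from the matched persons and
    # read company rows by direct index instead of scanning the company table.
    for person_row in matched:
        for e in range(WORKS_AT_BEGIN[person_row], WORKS_AT_BEGIN[person_row + 1]):
            company_row = WORKS_AT_DST[e]
            touched.add(company_row)
            results.append(
                (PERSONS["name"][person_row], COMPANIES["name"][company_row])
            )
    return results, touched

results, touched = employers_of_people_older_than(40)
```

In this run only one of the two company rows is read, which is the point of the technique: the cost of the second step depends on the size of the partial result, not on the size of the second table.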
-
Publication No.: US11288108B2
Publication Date: 2022-03-29
Application No.: US16701797
Filing Date: 2019-12-03
Applicant: ORACLE INTERNATIONAL CORPORATION
Inventor: Alberto Parravicini , Davide Bartolini , Lukas Stadler , Arnaud Delamare
Abstract: Techniques are provided for an automated method of adding out-of-bound access prevention in GPU kernels executed in a managed environment. In an embodiment, a system of computers compiles a GPU kernel code function that includes one or more array references that are memory address dependent. The system of computers compiles the kernel code function by generating a rewritten GPU kernel code module that includes, within the function signature of the rewritten GPU kernel code module, a respective array size parameter for each array reference of the one or more array references included in the GPU kernel code function. The system of computers further compiles the kernel code function by adding bounding protection instructions to the one or more potential out-of-bound access instructions in the rewritten GPU kernel code module. The potential out-of-bound access instructions comprise instructions that reference each respective array size parameter of the one or more array references. Afterwards, the rewritten GPU kernel code module is loaded in a virtual machine. Loading the rewritten GPU kernel code module in the virtual machine comprises modifying a host application to automatically transmit, from the host application, one or more input array size values. The one or more input array size values is referenced by the one or more potential out-of-bound-access instructions.
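The rewriting described in this abstract can be illustrated with a plain-Python simulation. It is not GPU code and not the patented compiler pass: the `GuardedArray` wrapper, the choice to skip out-of-bound writes and return 0.0 for out-of-bound reads, and the sequential `launch` loop are all assumptions made to show the shape of the idea (extra size parameters in the rewritten signature, guards on each access, sizes supplied automatically by the host).

```python
class GuardedArray:
    """Wraps an array together with its size and guards every access."""

    def __init__(self, data, size):
        self.data = data
        self.size = size

    def __getitem__(self, i):
        # Assumed policy: out-of-bound reads yield 0.0.
        return self.data[i] if 0 <= i < self.size else 0.0

    def __setitem__(self, i, value):
        # Assumed policy: out-of-bound writes are silently dropped.
        if 0 <= i < self.size:
            self.data[i] = value

def add_bounds_protection(kernel):
    """Return a rewritten kernel whose signature carries one size per array."""
    def rewritten(thread_id, arrays, sizes):
        guarded = [GuardedArray(a, n) for a, n in zip(arrays, sizes)]
        kernel(thread_id, *guarded)
    return rewritten

def launch(kernel, num_threads, *arrays):
    """Host side: rewrite the kernel and pass the array sizes automatically."""
    rewritten = add_bounds_protection(kernel)
    sizes = [len(a) for a in arrays]
    for thread_id in range(num_threads):  # sequential stand-in for GPU threads
        rewritten(thread_id, arrays, sizes)

def saxpy(tid, x, y):
    """Original kernel: no bounds check, unsafe if tid >= len(x)."""
    y[tid] = 2.0 * x[tid] + y[tid]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0] * 5
launch(saxpy, 8, x, y)  # 8 threads over 5 elements: threads 5..7 are guarded
```

Calling `saxpy` directly on the raw lists with a thread id of 5 or more would raise `IndexError`, which stands in for the out-of-bound access the rewrite is meant to prevent.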
-
Publication No.: US11205050B2
Publication Date: 2021-12-21
Application No.: US16179049
Filing Date: 2018-11-02
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra , Sungpack Hong , Jinha Kim , Damien Hilloulin , Davide Bartolini , Hassan Chafi
Abstract: Techniques are described herein for learning property graph representations edge-by-edge. In an embodiment, an input graph is received. The input graph comprises a plurality of vertices and a plurality of edges. Each vertex of the plurality of vertices is associated with vertex properties of the respective vertex. A vertex-to-property mapping is generated for each vertex of the plurality of vertices. The mapping maps each vertex to a vertex-property signature of a plurality of vertex-property signatures. A plurality of edge words is generated. Each edge word corresponds to one or more edges that each begin at a first vertex having a particular vertex-property signature of the plurality of vertex property signatures and end at a second vertex having a particular vertex-property signature of the plurality of vertex property signatures. A plurality of sentences is generated. Each sentence comprises edge words directly connected along a path of a plurality of paths in the input graph. Using the plurality of sentences and the plurality of edge words, a document vectorization model is used to generate machine learning vectors that represent the input graph.
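The preprocessing steps in this abstract (vertex-property signatures, edge words, sentences along paths) can be sketched as follows. The signature format, the `->` separator, and the sample graph are hypothetical, and the final document vectorization step is left out because the abstract does not name a specific model.

```python
def signature(props):
    """Vertex-property signature: a canonical string built from a vertex's
    properties (assumed format: sorted key=value pairs)."""
    return "|".join(f"{key}={props[key]}" for key in sorted(props))

def edge_word(src, dst, vertex_props):
    """One word per edge, determined only by the signatures of its endpoints,
    so edges between similarly described vertices share a word."""
    return signature(vertex_props[src]) + "->" + signature(vertex_props[dst])

def sentences_from_paths(paths, vertex_props):
    """Each path becomes a sentence: the edge words of its consecutive edges."""
    return [
        [edge_word(a, b, vertex_props) for a, b in zip(path, path[1:])]
        for path in paths
    ]

VERTEX_PROPS = {
    1: {"type": "person"},
    2: {"type": "company"},
    3: {"type": "city"},
    4: {"type": "person"},
}
PATHS = [[1, 2, 3], [4, 2, 3]]  # two paths through different person vertices

sentences = sentences_from_paths(PATHS, VERTEX_PROPS)
```

Both paths produce the same sentence because vertices 1 and 4 share a signature; the resulting sentences would then be fed to a document vectorization model to obtain vectors for the graph.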
-
Publication No.: US20210142008A1
Publication Date: 2021-05-13
Application No.: US17153078
Filing Date: 2021-01-20
Applicant: Oracle International Corporation
Inventor: Rhicheek Patra , Davide Bartolini , Sungpack Hong , Hassan Chafi , Alberto Parravicini
IPC: G06F40/295 , G06N5/02
Abstract: According to an embodiment, a method includes converting a knowledge base into a graph. In this embodiment, the knowledge base contains a plurality of entities and specifies a plurality of relationships among the plurality of entities, and entities in the knowledge base correspond to vertices in the graph, and relationships between entities in the knowledge base correspond to edges between vertices in the graph. The method may also include extracting a plurality of vertex embeddings from the graph. An example vertex embedding of the plurality of vertex embeddings represents, for a particular vertex, a proximity of the particular vertex to other vertices of the graph. Further, the method may include performing, based at least in part on the plurality of vertex embeddings, entity linking between input text and the knowledge base.
-
Publication No.: US20200257982A1
Publication Date: 2020-08-13
Application No.: US16270535
Filing Date: 2019-02-07
Applicant: Oracle International Corporation
Inventor: Jinha Kim , Rhicheek Patra , Sungpack Hong , Damien Hilloulin , Davide Bartolini , Hassan Chafi
Abstract: Techniques are described herein for encoding categorical features of property graphs by vertex proximity. In an embodiment, an input graph is received. The input graph comprises a plurality of vertices, each vertex of said plurality of vertices is associated with vertex properties of said vertex. The vertex properties include at least one categorical feature value of one or more potential categorical feature values. For each of the one or more potential categorical feature values of each vertex, a numerical feature value is generated. The numerical feature value represents a proximity of the respective vertex to other vertices of the plurality of vertices that have a categorical feature value corresponding to the respective potential categorical feature value. Using the numerical feature values for each vertex, proximity encoding data is generated representing said input graph. The proximity encoding data is used to efficiently train machine learning models that produce results with enhanced accuracy.
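The encoding in this abstract replaces each categorical value with one number per potential value, reflecting how close the vertex is to other vertices carrying that value. The sketch below assumes one particular proximity measure, 1 / (1 + hop distance to the nearest other vertex with that value), which the abstract does not prescribe; the graph and colour labels are hypothetical.

```python
from collections import deque

def hop_distances(graph, start):
    """Shortest hop distance from `start` to every reachable vertex (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for nxt in graph[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist

def proximity_encode(graph, category, values):
    """For each vertex, emit one numeric feature per potential category value:
    1 / (1 + distance to the nearest OTHER vertex with that value), else 0.0."""
    encoding = {}
    for v in graph:
        dist = hop_distances(graph, v)
        row = []
        for value in values:
            reachable = [
                dist[u] for u in dist if u != v and category[u] == value
            ]
            row.append(1.0 / (1 + min(reachable)) if reachable else 0.0)
        encoding[v] = row
    return encoding

# Path graph a - b - c - d with one categorical feature per vertex.
GRAPH = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
CATEGORY = {"a": "red", "b": "blue", "c": "blue", "d": "green"}
VALUES = ["blue", "green", "red"]

encoding = proximity_encode(GRAPH, CATEGORY, VALUES)
```

Unlike one-hot encoding, the resulting features vary smoothly with graph position: vertex `a` gets 0.5 for blue and 0.25 for green even though its own value is red.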
-