Named entity disambiguation using entity distance in a knowledge graph

    Publication No.: US10902203B2

    Publication date: 2021-01-26

    Application No.: US16392386

    Filing date: 2019-04-23

    Abstract: Techniques are described herein for performing named entity disambiguation. According to an embodiment, a method includes receiving input text, extracting a first mention and a second mention from the input text, and selecting, from a knowledge graph, a plurality of first candidate vertices for the first mention and a plurality of second candidate vertices for the second mention. The present method also includes evaluating a score function that analyzes vertex embedding similarity between the plurality of first candidate vertices and the plurality of second candidate vertices. In response to evaluating and seeking to optimize the score function, the method performs selecting a first selected candidate vertex from the plurality of first candidate vertices and a second selected candidate vertex from the plurality of second candidate vertices. Further, the present method includes mapping a first entry from the knowledge graph to the first mention and mapping a second entry from the knowledge graph to the second mention. In this embodiment, the first entry corresponds to the first selected candidate vertex and the second entry corresponds to the second selected candidate.
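
The selection step in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the vertex embedding similarity and an exhaustive search over candidate pairs as the way to optimize the score; the abstract does not specify either. The entity names and embedding values are hypothetical toy data.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def disambiguate(first_candidates, second_candidates, embeddings):
    """Return the candidate pair with the highest embedding similarity.

    Assumption: the score function is plain cosine similarity and the
    optimum is found by trying every pair.
    """
    return max(
        product(first_candidates, second_candidates),
        key=lambda pair: cosine(embeddings[pair[0]], embeddings[pair[1]]),
    )


# Hypothetical vertex embeddings for candidates of two mentions.
embeddings = {
    "Paris_(city)": [0.9, 0.1],
    "Paris_(mythology)": [0.1, 0.9],
    "France": [0.8, 0.2],
    "France_(surname)": [-0.5, 0.5],
}

first, second = disambiguate(
    ["Paris_(city)", "Paris_(mythology)"],
    ["France", "France_(surname)"],
    embeddings,
)
print(first, second)
```

Each mention is then mapped to the knowledge graph entry of its selected vertex. With more than two mentions, an exhaustive search grows quickly, so a practical system would likely need a cheaper optimization strategy.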

    Efficient, in-memory, relational representation for heterogeneous graphs

    Publication No.: US11120082B2

    Publication date: 2021-09-14

    Application No.: US15956115

    Filing date: 2018-04-18

    Abstract: Techniques are provided herein for efficient representation of heterogeneous graphs in memory. In an embodiment, vertices and edges of the graph are segregated by type. Each property of a type of vertex or edge has values stored in a respective vector. Directed or undirected edges of a same type are stored in compressed sparse row (CSR) format. The CSR format is more or less repeated for edge traversal in either forward or reverse direction. An edge map translates edge offsets obtained from traversal in the reverse direction for use with data structures that expect edge offsets in the forward direction. Subsequent filtration and/or traversal by type or property of vertex or edge entails minimal data access and maximal data locality, thereby increasing efficient use of the graph.
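
The CSR layout and the edge map can be sketched as follows for a single edge type. This is an illustrative reading of the abstract, not the patented implementation: the helper `build_csr`, the `weight` property, and the toy edges are all assumptions.

```python
def build_csr(num_vertices, edges):
    """Group edges by source vertex.

    Returns (offsets, neighbors, edge_ids), where edge_ids[pos] is the
    index in `edges` of the edge stored at CSR position `pos`.
    """
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]
    neighbors = [0] * len(edges)
    edge_ids = [0] * len(edges)
    cursor = offsets[:-1]  # next free slot per source vertex (a copy)
    for eid, (src, dst) in enumerate(edges):
        pos = cursor[src]
        neighbors[pos] = dst
        edge_ids[pos] = eid
        cursor[src] += 1
    return offsets, neighbors, edge_ids


# Hypothetical edges of one type, with one property value per edge.
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
weight_by_edge = [0.5, 1.5, 2.5, 3.5]
n = 3

# Forward CSR; the property vector is stored in forward edge order.
fwd_offsets, fwd_dst, fwd_ids = build_csr(n, edges)
weight = [weight_by_edge[eid] for eid in fwd_ids]
fwd_pos = [0] * len(edges)
for pos, eid in enumerate(fwd_ids):
    fwd_pos[eid] = pos

# Reverse CSR over the same edges, plus the edge map that translates a
# reverse edge offset into the corresponding forward edge offset.
rev_offsets, rev_src, rev_ids = build_csr(n, [(d, s) for s, d in edges])
edge_map = [fwd_pos[eid] for eid in rev_ids]


def in_edges(v):
    """Incoming (source, weight) pairs of v via reverse traversal."""
    return [
        (rev_src[p], weight[edge_map[p]])
        for p in range(rev_offsets[v], rev_offsets[v + 1])
    ]


print(in_edges(2))
```

Because the property vector is kept only once, in forward order, reverse traversal needs the edge map to find the right property slot.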

    NAMED ENTITY DISAMBIGUATION USING ENTITY DISTANCE IN A KNOWLEDGE GRAPH

    Publication No.: US20200342055A1

    Publication date: 2020-10-29

    Application No.: US16392386

    Filing date: 2019-04-23

    Abstract: Techniques are described herein for performing named entity disambiguation. According to an embodiment, a method includes receiving input text, extracting a first mention and a second mention from the input text, and selecting, from a knowledge graph, a plurality of first candidate vertices for the first mention and a plurality of second candidate vertices for the second mention. The present method also includes evaluating a score function that analyzes vertex embedding similarity between the plurality of first candidate vertices and the plurality of second candidate vertices. In response to evaluating and seeking to optimize the score function, the method performs selecting a first selected candidate vertex from the plurality of first candidate vertices and a second selected candidate vertex from the plurality of second candidate vertices. Further, the present method includes mapping a first entry from the knowledge graph to the first mention and mapping a second entry from the knowledge graph to the second mention. In this embodiment, the first entry corresponds to the first selected candidate vertex and the second entry corresponds to the second selected candidate.

    Named entity disambiguation using entity distance in a knowledge graph

    Publication No.: US11526673B2

    Publication date: 2022-12-13

    Application No.: US17153078

    Filing date: 2021-01-20

    Abstract: According to an embodiment, a method includes converting a knowledge base into a graph. In this embodiment, the knowledge base contains a plurality of entities and specifies a plurality of relationships among the plurality of entities, and entities in the knowledge base correspond to vertices in the graph, and relationships between entities in the knowledge base correspond to edges between vertices in the graph. The method may also include extracting a plurality of vertex embeddings from the graph. An example vertex embedding of the plurality of vertex embeddings represents, for a particular vertex, a proximity of the particular vertex to other vertices of the graph. Further, the method may include performing, based at least in part on the plurality of vertex embeddings, entity linking between input text and the knowledge base.
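
A rough sketch of the first two steps, converting a knowledge base into a graph and deriving a proximity-based vector per vertex, might look like this. The abstract does not say how embeddings are extracted; the inverse hop distance used here is a simple stand-in chosen for illustration, and the triples are hypothetical.

```python
from collections import defaultdict, deque


def kb_to_graph(triples):
    """Turn (subject, relation, object) triples into an undirected graph."""
    graph = defaultdict(set)
    for subject, _relation, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def proximity_embedding(graph, source, vertex_order):
    """Stand-in embedding: 1 / (1 + hop distance) to each vertex.

    Unreachable vertices get 0.0. A real system would more likely use a
    learned embedding; this only illustrates "proximity to other vertices".
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return [1.0 / (1 + dist[v]) if v in dist else 0.0 for v in vertex_order]


# Hypothetical knowledge base entries.
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Zeus", "father_of", "Helen"),
]
graph = kb_to_graph(triples)
order = sorted(graph)
paris = proximity_embedding(graph, "Paris", order)
print(order, paris)
```

Entity linking would then compare such vectors for the candidate entities of each mention in the input text.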

    EFFICIENT GRAPH QUERY EXECUTION ENGINE SUPPORTING GRAPHS WITH MULTIPLE VERTEX AND EDGE TYPES

    Publication No.: US20200265090A1

    Publication date: 2020-08-20

    Application No.: US16280591

    Filing date: 2019-02-20

    Abstract: Herein are computerized techniques for processing a heterogeneous graph according to scan-avoidant query planning. In an embodiment, a computer respectively stores a first and second kind of vertices of a property graph into a first and second vertex tables. The computer generates, without scanning the second vertex table: a) an initial partial result of a query of the property graph based on the first vertex table, and b) a subsequent partial result of the query based on the initial partial result and the second kind of vertices. Herein are graph encodings that are dense, without requiring extra computation, and that exploit graph heterogeneity to achieve an aggregation granularity that reduces data working set scope, optimizes for caching, and encourages compression. Herein are query execution mechanisms and techniques that intelligently avoid accessing circumstantially extraneous data and/or structures and that can horizontally scale.
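
One way to picture the scan-avoidant idea is a pair of per-type vertex tables joined through edges that store row indexes. In the sketch below, only the first table is filtered; rows of the second table are reached by direct index lookup rather than by scanning it. The table contents, the `works_at` edge type, and the query are hypothetical, and the actual planner described in the application is certainly more involved.

```python
# Two vertex tables, one per vertex type, stored column-wise.
persons = {"name": ["Ann", "Bob", "Cy"], "age": [41, 25, 37]}
companies = {"name": ["Acme", "Bolt", "Core"]}

# Hypothetical works_at edges in CSR form: for person row r, the slice
# works_at_targets[offsets[r]:offsets[r + 1]] holds company row indexes.
works_at_offsets = [0, 1, 2, 4]
works_at_targets = [2, 0, 0, 2]


def employers_of_persons_older_than(min_age):
    """Names of companies employing at least one person older than min_age."""
    # Initial partial result: computed from the first vertex table only.
    person_rows = [r for r, age in enumerate(persons["age"]) if age > min_age]

    # Subsequent partial result: follow edges to company rows. The
    # companies table is indexed directly and never scanned.
    company_rows = set()
    for r in person_rows:
        start, end = works_at_offsets[r], works_at_offsets[r + 1]
        company_rows.update(works_at_targets[start:end])
    return sorted(companies["name"][c] for c in company_rows)


print(employers_of_persons_older_than(30))
```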

    Automatic out-of-bound access prevention in GPU kernels executed in a managed environment

    Publication No.: US11288108B2

    Publication date: 2022-03-29

    Application No.: US16701797

    Filing date: 2019-12-03

    Abstract: Techniques are provided for an automated method of adding out-of-bound access prevention in GPU kernels executed in a managed environment. In an embodiment, a system of computers compiles a GPU kernel code function that includes one or more array references that are memory address dependent. The system of computers compiles the kernel code function by generating a rewritten GPU kernel code module that includes, within the function signature of the rewritten GPU kernel code module, a respective array size parameter for each array reference of the one or more array references included in the GPU kernel code function. The system of computers further compiles the kernel code function by adding bounding protection instructions to the one or more potential out-of-bound access instructions in the rewritten GPU kernel code module. The potential out-of-bound access instructions comprise instructions that reference each respective array size parameter of the one or more array references. Afterwards, the rewritten GPU kernel code module is loaded in a virtual machine. Loading the rewritten GPU kernel code module in the virtual machine comprises modifying a host application to automatically transmit, from the host application, one or more input array size values. The one or more input array size values are referenced by the one or more potential out-of-bound access instructions.

    Learning property graph representations edge-by-edge

    Publication No.: US11205050B2

    Publication date: 2021-12-21

    Application No.: US16179049

    Filing date: 2018-11-02

    Abstract: Techniques are described herein for learning property graph representations edge-by-edge. In an embodiment, an input graph is received. The input graph comprises a plurality of vertices and a plurality of edges. Each vertex of the plurality of vertices is associated with vertex properties of the respective vertex. A vertex-to-property mapping is generated for each vertex of the plurality of vertices. The mapping maps each vertex to a vertex-property signature of a plurality of vertex-property signatures. A plurality of edge words is generated. Each edge word corresponds to one or more edges that each begin at a first vertex having a particular vertex-property signature of the plurality of vertex property signatures and end at a second vertex having a particular vertex-property signature of the plurality of vertex property signatures. A plurality of sentences is generated. Each sentence comprises edge words directly connected along a path of a plurality of paths in the input graph. Using the plurality of sentences and the plurality of edge words, a document vectorization model is used to generate machine learning vectors that represent the input graph.
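
The construction of edge words and sentences can be sketched as below. The signature format (sorted `key=value` pairs), the `->` separator, and the toy vertices are assumptions made for illustration; the final document vectorization step is left out.

```python
def signature(properties):
    """Vertex-property signature: property pairs in a canonical order."""
    return "|".join(f"{key}={properties[key]}" for key in sorted(properties))


def edge_word(src, dst, vertex_props):
    """One word per (source signature, destination signature) pair."""
    return f"{signature(vertex_props[src])}->{signature(vertex_props[dst])}"


def sentence(path, vertex_props):
    """Edge words of the consecutive edges along a path of vertices."""
    return [edge_word(a, b, vertex_props) for a, b in zip(path, path[1:])]


# Hypothetical vertices and two paths through the input graph.
vertex_props = {
    "a": {"type": "person"},
    "b": {"type": "account"},
    "c": {"type": "bank"},
    "d": {"type": "person"},
}
paths = [["a", "b", "c"], ["d", "b", "c"]]
sentences = [sentence(path, vertex_props) for path in paths]
print(sentences)
```

Edges a-b and d-b map to the same edge word because their endpoints share signatures, which is what lets a document vectorization model treat structurally similar paths alike.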

    NAMED ENTITY DISAMBIGUATION USING ENTITY DISTANCE IN A KNOWLEDGE GRAPH

    Publication No.: US20210142008A1

    Publication date: 2021-05-13

    Application No.: US17153078

    Filing date: 2021-01-20

    Abstract: According to an embodiment, a method includes converting a knowledge base into a graph. In this embodiment, the knowledge base contains a plurality of entities and specifies a plurality of relationships among the plurality of entities, and entities in the knowledge base correspond to vertices in the graph, and relationships between entities in the knowledge base correspond to edges between vertices in the graph. The method may also include extracting a plurality of vertex embeddings from the graph. An example vertex embedding of the plurality of vertex embeddings represents, for a particular vertex, a proximity of the particular vertex to other vertices of the graph. Further, the method may include performing, based at least in part on the plurality of vertex embeddings, entity linking between input text and the knowledge base.

    CATEGORICAL FEATURE ENCODING FOR PROPERTY GRAPHS BY VERTEX PROXIMITY

    Publication No.: US20200257982A1

    Publication date: 2020-08-13

    Application No.: US16270535

    Filing date: 2019-02-07

    Abstract: Techniques are described herein for encoding categorical features of property graphs by vertex proximity. In an embodiment, an input graph is received. The input graph comprises a plurality of vertices, and each vertex of said plurality of vertices is associated with vertex properties of said vertex. The vertex properties include at least one categorical feature value of one or more potential categorical feature values. For each of the one or more potential categorical feature values of each vertex, a numerical feature value is generated. The numerical feature value represents a proximity of the respective vertex to other vertices of the plurality of vertices that have a categorical feature value corresponding to the respective potential categorical feature value. Using the numerical feature values for each vertex, proximity encoding data is generated representing said input graph. The proximity encoding data is used to efficiently train machine learning models that produce results with enhanced accuracy.
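
A minimal sketch of proximity encoding follows. The abstract does not define the proximity measure, so this example assumes 1 / (1 + hops) to the nearest other vertex carrying a given category value, with 0.0 when no such vertex is reachable. The graph and the category values are hypothetical.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def proximity_encode(adjacency, category, values):
    """One numerical feature per vertex and potential category value.

    Assumed measure: 1 / (1 + hops to the nearest OTHER vertex whose
    categorical feature equals that value), or 0.0 if there is none.
    """
    encoding = {}
    for v in adjacency:
        dist = hop_distances(adjacency, v)
        row = []
        for value in values:
            hops = [d for u, d in dist.items() if u != v and category[u] == value]
            row.append(1.0 / (1 + min(hops)) if hops else 0.0)
        encoding[v] = row
    return encoding


# Hypothetical path graph a - b - c with one categorical feature.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
category = {"a": "red", "b": "blue", "c": "blue"}
encoding = proximity_encode(adjacency, category, ["red", "blue"])
print(encoding)
```

The resulting rows are dense numerical vectors that can be fed to a model in place of a one-hot encoding of the category.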
