CATEGORICAL FEATURE ENCODING FOR PROPERTY GRAPHS BY VERTEX PROXIMITY

    Publication number: US20200257982A1

    Publication date: 2020-08-13

    Application number: US16270535

    Filing date: 2019-02-07

    Abstract: Techniques are described herein for encoding categorical features of property graphs by vertex proximity. In an embodiment, an input graph is received. The input graph comprises a plurality of vertices, and each vertex of said plurality of vertices is associated with vertex properties of said vertex. The vertex properties include at least one categorical feature value of one or more potential categorical feature values. For each of the one or more potential categorical feature values of each vertex, a numerical feature value is generated. The numerical feature value represents a proximity of the respective vertex to other vertices of the plurality of vertices that have a categorical feature value corresponding to the respective potential categorical feature value. Using the numerical feature values for each vertex, proximity encoding data representing said input graph is generated. The proximity encoding data is used to efficiently train machine learning models that produce results with enhanced accuracy.
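
    The encoding step described in this abstract can be illustrated with a small sketch. The abstract does not say how proximity is measured, so the score used here is an assumption: the reciprocal of the hop count to the nearest other vertex that holds the category, or 0.0 if no such vertex is reachable. The function names (`hop_distances`, `proximity_encode`) and the toy graph are hypothetical.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Breadth-first search: hop count from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def proximity_encode(adjacency, category_of, categories):
    """Return {vertex: [one numerical score per potential category]}.

    Assumed score (not taken from the patent): 1 / hops to the nearest
    OTHER vertex holding that category, or 0.0 if none is reachable.
    """
    encoding = {}
    for v in adjacency:
        dist = hop_distances(adjacency, v)
        row = []
        for c in categories:
            hops = [d for u, d in dist.items() if u != v and category_of[u] == c]
            row.append(1.0 / min(hops) if hops else 0.0)
        encoding[v] = row
    return encoding


# Toy path graph a - b - c - d with a categorical "color" property.
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
category_of = {"a": "red", "b": "blue", "c": "red", "d": "blue"}
categories = ["red", "blue"]

encoding = proximity_encode(adjacency, category_of, categories)
```

    Each vertex ends up with one number per potential category, so the resulting rows can be fed to a model as ordinary numerical features in place of a one-hot encoding.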

    Methods of graph-type specialization and optimization in graph algorithm DSL compilation

    Publication number: US10585945B2

    Publication date: 2020-03-10

    Application number: US15666310

    Filing date: 2017-08-01

    Abstract: Techniques herein generate, such as during compilation, polymorphic dispatch logic (PDL) to switch between specialized implementations of a polymorphic graph algorithm. In an embodiment, a computer detects, within source logic of a graph algorithm, that the algorithm processes an instance of a generic graph type. The computer generates several alternative implementations of the algorithm. Each implementation is specialized to process the graph instance as an instance of a respective graph subtype. The computer generates PDL that performs dynamic dispatch as follows. At runtime, the PDL receives a graph instance of the generic graph type. The PDL detects which particular graph subtype the graph instance belongs to. The PDL then invokes whichever alternative implementation is specialized to process the graph instance as an instance of the detected particular graph subtype. In embodiments, the source logic is expressed in a domain specific language (DSL), e.g., for analysis, traversal, or querying of graphs.
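
    The dispatch pattern in this abstract can be sketched by hand. In the patent the specialized implementations and the dispatch logic are generated by a compiler from DSL source; here they are written manually in Python, and the class names, the edge-counting algorithm, and the `_SPECIALIZATIONS` table are all hypothetical. The undirected variant assumes no self-loops.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Generic graph type: adjacency lists keyed by vertex."""
    adjacency: dict = field(default_factory=dict)


class DirectedGraph(Graph):
    """Subtype: every adjacency entry is one directed edge."""


class UndirectedGraph(Graph):
    """Subtype: every edge appears in both endpoints' lists (no self-loops assumed)."""


def edge_count_directed(g):
    # Specialized implementation: each entry counts once.
    return sum(len(nbrs) for nbrs in g.adjacency.values())


def edge_count_undirected(g):
    # Specialized implementation: each edge was stored twice.
    return sum(len(nbrs) for nbrs in g.adjacency.values()) // 2


# Hand-written stand-in for the generated dispatch table.
_SPECIALIZATIONS = {
    DirectedGraph: edge_count_directed,
    UndirectedGraph: edge_count_undirected,
}


def edge_count(g: Graph) -> int:
    """Dispatch logic: detect the runtime subtype, then call its implementation."""
    impl = _SPECIALIZATIONS.get(type(g))
    if impl is None:
        raise TypeError(f"no specialization for {type(g).__name__}")
    return impl(g)


directed = DirectedGraph({"a": ["b"], "b": ["a", "c"], "c": []})
undirected = UndirectedGraph({"a": ["b"], "b": ["a", "c"], "c": ["b"]})
directed_edges = edge_count(directed)
undirected_edges = edge_count(undirected)
```

    Callers only ever see `edge_count` and the generic `Graph` type; the subtype check happens once per call, and the body that runs contains no per-edge branching on graph kind.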

    EFFICIENT DATA DECODING USING RUNTIME SPECIALIZATION

    Publication number: US20190377589A1

    Publication date: 2019-12-12

    Application number: US16006668

    Filing date: 2018-06-12

    Abstract: Computer-implemented techniques described herein provide efficient data decoding using runtime specialization. In an embodiment, a method comprises a virtual machine executing a body of code of a dynamically typed language, wherein executing the body of code includes: querying a relational database, and in response to the query, receiving table metadata indicating data types of one or more columns of a first table in the relational database. In response to receiving the table metadata: for a first column of the one or more columns, generating decoding machine code to decode the first column based on the data type of the first column, and executing the decoding machine code to decode the first column of the one or more columns.
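
    A rough analogue of this idea can be shown in plain Python. The patent describes a virtual machine emitting decoding machine code once column types are known; the sketch below instead generates Python source at runtime and compiles it with `exec`, which is only a stand-in for that step. The type names, the string-based wire format, and `make_row_decoder` are assumptions made for illustration.

```python
# Hypothetical mapping from column type names to Python converter names.
# Assumption: every value arrives from the database driver as a string.
_CONVERTERS = {"INTEGER": "int", "REAL": "float", "TEXT": "str"}


def make_row_decoder(column_types):
    """Generate and compile a row decoder specialized to the given column types.

    The generated function contains one fixed conversion per column, so no
    type lookup happens while rows are being decoded.
    """
    fields = ", ".join(
        f"{_CONVERTERS[t]}(raw[{i}])" for i, t in enumerate(column_types)
    )
    source = f"def decode_row(raw):\n    return [{fields}]\n"
    namespace = {}
    exec(source, namespace)  # compile the specialized decoder at runtime
    return namespace["decode_row"]


# Table metadata as it might be reported after a query (hypothetical).
table_metadata = ["INTEGER", "TEXT", "REAL"]
decode_row = make_row_decoder(table_metadata)
decoded = decode_row(["7", "ada", "2.5"])
```

    The cost of generating the decoder is paid once per result set; every subsequent row goes through straight-line conversion code with the type decisions already baked in.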

    Constructing an in-memory representation of a graph

    Publication number: US10055509B2

    Publication date: 2018-08-21

    Application number: US14680150

    Filing date: 2015-04-07

    CPC classification number: G06F16/9024 G06F16/2246 G06F2201/80

    Abstract: Techniques for efficiently loading graph data into memory are provided. A plurality of node ID lists are retrieved from storage. Each node ID list is ordered based on one or more order criteria, such as node ID, and is read into memory. A new list of node IDs is created in memory and is initially empty. From among the plurality of node ID lists, a particular node ID is selected based on the one or more order criteria, removed from the node ID list where the particular node ID originates, and added to the new list. This process of selecting, removing, and adding continues until no more than one node ID list exists, other than the new list. In this way, the retrieval of the plurality of node ID lists from storage may be performed in parallel while the selecting and adding are performed sequentially.
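
    The select, remove, and add loop in this abstract is essentially a k-way merge of sorted lists. The sketch below assumes ascending node ID as the order criterion and uses a heap to pick the next ID; it tracks a read position per list instead of physically removing elements, which is a simplification. The function name `merge_node_id_lists` is hypothetical, and the parallel retrieval from storage is not shown.

```python
import heapq


def merge_node_id_lists(node_id_lists):
    """Merge ascending node ID lists into one new ascending list."""
    merged = []  # the new list, initially empty
    # Heap entries: (next node ID, index of source list, position in that list).
    heap = [(ids[0], i, 0) for i, ids in enumerate(node_id_lists) if ids]
    heapq.heapify(heap)

    # Keep selecting the smallest ID while more than one source list remains.
    while len(heap) > 1:
        node_id, i, pos = heapq.heappop(heap)
        merged.append(node_id)
        if pos + 1 < len(node_id_lists[i]):
            heapq.heappush(heap, (node_id_lists[i][pos + 1], i, pos + 1))

    # At most one source list is left; its tail is already in order.
    if heap:
        _, i, pos = heap[0]
        merged.extend(node_id_lists[i][pos:])
    return merged


merged_ids = merge_node_id_lists([[1, 4, 9], [2, 3, 10], [5]])
```

    Stopping once a single source list remains mirrors the termination condition in the abstract: the remaining IDs need no further comparison and can be appended as a block.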
