Abstract:
Techniques are described for supporting graph pattern matching queries inside a relational database management system (RDBMS) that supports SQL execution. The techniques compile a graph pattern matching query that includes a bounded recursive pattern into a SQL query that the relational engine can then execute. As a result, graph pattern matching queries that include bounded recursive patterns run on top of the relational engine without any change to the existing SQL engine.
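One way to picture the compilation step is to unroll the bounded quantifier into a fixed number of self-joins. The sketch below is only an illustration under assumptions not stated in the abstract: a hypothetical `edges(src, dst)` table, a hypothetical helper named `compile_bounded_path`, and a UNION ALL of one branch per hop count as the unrolling strategy.

```python
import sqlite3

def compile_bounded_path(edge_table: str, min_hops: int, max_hops: int) -> str:
    """Unroll a path pattern of min_hops..max_hops edges into plain SQL.

    Each hop count becomes one SELECT that self-joins the edge table,
    and the branches are combined with UNION ALL.
    """
    if not 1 <= min_hops <= max_hops:
        raise ValueError("expected 1 <= min_hops <= max_hops")
    branches = []
    for hops in range(min_hops, max_hops + 1):
        from_clause = ", ".join(f"{edge_table} e{i}" for i in range(hops))
        # Chain consecutive edges: the end of edge i is the start of edge i + 1.
        joins = [f"e{i}.dst = e{i + 1}.src" for i in range(hops - 1)]
        where = f" WHERE {' AND '.join(joins)}" if joins else ""
        branches.append(
            f"SELECT e0.src AS src, e{hops - 1}.dst AS dst, {hops} AS hops "
            f"FROM {from_clause}{where}"
        )
    return " UNION ALL ".join(branches)

# Run the generated SQL on an unmodified relational engine (SQLite here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
conn.executemany("INSERT INTO edges VALUES (?, ?)", [(1, 2), (2, 3), (3, 4)])
sql = compile_bounded_path("edges", 1, 2)
rows = sorted(conn.execute(sql).fetchall())
```

Because the upper bound is known at compile time, the generated query needs no recursive SQL construct; whether the described techniques use this particular unrolling is not stated in the abstract.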
Abstract:
Techniques are described herein for early pruning of potential graph query results. Specifically, based on determining that property values of a path through graph data cannot affect results of a query, the path is pruned from a set of potential query solutions prior to fully exploring the path. Early solution pruning is performed on prunable queries that project prunable functions including MIN, MAX, SUM, and DISTINCT, the results of which are not tied to a number of paths explored for query execution. A database system implements early solution pruning for a prunable query based on intermediate results maintained for the query during query execution. Specifically, when a system determines that property values of a given potential solution path cannot affect the query results reflected in intermediate results maintained for the query, the path is discarded from the set of possible query solutions without further exploration of the path.
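As a rough illustration of early pruning for a MIN projection, the sketch below searches for the minimum total edge weight over paths between two vertices and discards any partial path whose running total can no longer lower the current minimum. It assumes an acyclic graph held in an adjacency dictionary and non-negative weights; the function name `min_path_weight` and the data layout are hypothetical and not taken from the abstract.

```python
def min_path_weight(graph, source, target):
    """Return (minimum path weight, number of pruned partial paths).

    Assumes a directed acyclic graph with non-negative edge weights, so a
    partial path whose total already reaches the best known result cannot
    change MIN and is dropped without further exploration.
    """
    best = None      # intermediate result maintained during execution
    pruned = 0
    stack = [(source, 0)]
    while stack:
        vertex, total = stack.pop()
        if best is not None and total >= best:
            pruned += 1          # cannot affect the MIN result: prune early
            continue
        if vertex == target:
            best = total         # better solution found, tighten the bound
            continue
        for neighbor, weight in graph.get(vertex, []):
            stack.append((neighbor, total + weight))
    return best, pruned

graph = {
    "a": [("c", 10), ("b", 1)],
    "b": [("d", 1)],
    "c": [("d", 1), ("e", 1)],
    "e": [("d", 1)],
}
best, pruned = min_path_weight(graph, "a", "d")
```

In this example the cheap path through `b` is completed first, so the expensive branch through `c` is discarded before any of its continuations are explored.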
Abstract:
Techniques to efficiently assign available workers to execute multiple graph queries concurrently on a distributed graph database are disclosed. The techniques comprise a runtime engine assigning multiple workers to execute portions of multiple graph queries, each worker in each assignment asynchronously executing a portion of a graph query within a parallel-while construct that includes return statements at different locations, and the runtime engine reassigning a worker to execute another portion of the same or a different graph query to optimize the overall performance of all workers.
Abstract:
Herein are techniques that extend a software system to embed new guest programming languages (GPLs) that interoperate in a transparent, modular, and configurable way. In embodiments, a computer inserts an implementation of a GPL into a deployment of the system. A command registers the GPL, defines subroutines for the GPL, generates a guest virtual environment, and adds a binding of a dependency to a guest module. In an embodiment, a native programming language invokes a guest programming language to cause importing intra- or inter-language dependencies. An embodiment defines a guest object that is implemented in a first GPL and accessed from a second GPL. In an embodiment, dependencies are retrieved from a virtual file system having several alternative implementation mechanisms that include: an archive file or an actual file system, and a memory buffer or a column of a database table.
Abstract:
Herein are computerized techniques for deploying JavaScript and TypeScript stored procedures and user-defined functions into a database management system (DBMS). In an embodiment, a computer generates a SQL call specification for each subroutine of one or more subroutines encoded in a scripting language. The generating is based on a signature declaration of the subroutine. Each subroutine comprises a definition of a stored procedure or a user-defined function. The computer packages the definition and the SQL call specification of each subroutine into a single bundle file. The definition and the SQL call specification of each subroutine are deployed into a DBMS from the single bundle file. Eventually, the SQL call specification of at least one subroutine is invoked to execute the definition of the subroutine in the DBMS.
Abstract:
A storage manager maintains metadata for a plurality of graph components including, for each given graph component, a memory-state indicator that indicates whether the given graph component is stored in memory. The storage manager identifies a set of graph components required to execute a graph processing operation and identifies, based on the metadata, a first subset of the set of graph components that are stored in the memory and a second subset of the set of graph components that are not stored in the memory. The storage manager loads the second subset of graph components into memory and initiates execution of the graph processing operation using the set of graph components in memory.
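The flow described above can be sketched as a small class that tracks a memory-state indicator per component and loads only what is missing. All names here (`StorageManager`, `prepare`, `run`, the `loader` callable) are hypothetical, and components are modeled as plain Python values rather than real graph structures.

```python
class StorageManager:
    """Tracks which graph components are in memory and loads the rest on demand."""

    def __init__(self, loader):
        self._loader = loader    # hypothetical callable: component id -> data
        self._in_memory = {}     # metadata: component id -> memory-state indicator
        self._data = {}

    def prepare(self, required):
        """Split required components into resident and missing, then load the missing ones."""
        resident = [c for c in required if self._in_memory.get(c, False)]
        missing = [c for c in required if not self._in_memory.get(c, False)]
        for component in missing:
            self._data[component] = self._loader(component)
            self._in_memory[component] = True
        return resident, missing

    def run(self, required, operation):
        """Ensure every required component is in memory, then execute the operation."""
        self.prepare(required)
        return operation({c: self._data[c] for c in required})

loads = []

def loader(component):
    loads.append(component)      # record each load from backing storage
    return f"<{component} data>"

manager = StorageManager(loader)
first = manager.run(["vertices", "edges"], lambda parts: sorted(parts))
resident, missing = manager.prepare(["vertices", "labels"])
```

On the second request only `labels` is loaded, because the metadata already marks `vertices` as resident.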
Abstract:
A graph rebalancing approach is provided that allows a distributed graph system to effectively support elasticity by incrementally balancing distributed in-memory graphs uniformly or in a custom manner on a set of given machines. Performing the incremental rebalancing operation comprises selecting a chunk in a source machine in the cluster having a surplus of chunks, selecting a target machine in the cluster having a deficit of chunks, transferring the selected chunk from the source machine to the target machine, and updating metadata in each machine in the cluster to reflect a location of the graph data elements in the selected chunk in the target machine.
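The incremental loop described above can be sketched for the uniform case as follows. This is a simplified single-process model under stated assumptions: machines are dictionary keys, chunks are opaque identifiers, and the `location` dictionary stands in for the metadata that each machine in the cluster would hold. The function name `rebalance` is hypothetical.

```python
def rebalance(chunks_by_machine):
    """Move one chunk at a time from a surplus machine to a deficit machine.

    Returns the list of moves and the updated chunk -> machine location map.
    """
    machines = sorted(chunks_by_machine)
    total = sum(len(chunks_by_machine[m]) for m in machines)
    base, extra = divmod(total, len(machines))
    # Uniform target: the first `extra` machines hold one additional chunk.
    target = {m: base + (1 if i < extra else 0) for i, m in enumerate(machines)}
    location = {c: m for m in machines for c in chunks_by_machine[m]}
    moves = []
    while True:
        surplus = [m for m in machines if len(chunks_by_machine[m]) > target[m]]
        deficit = [m for m in machines if len(chunks_by_machine[m]) < target[m]]
        if not surplus or not deficit:
            break
        src, dst = surplus[0], deficit[0]
        chunk = chunks_by_machine[src].pop()     # select a chunk on the source
        chunks_by_machine[dst].append(chunk)     # transfer it to the target
        location[chunk] = dst                    # update the location metadata
        moves.append((chunk, src, dst))
    return moves, location

cluster = {"m1": [0, 1, 2, 3, 4, 5], "m2": [], "m3": []}
moves, location = rebalance(cluster)
```

A custom (non-uniform) balance would only change how `target` is computed; the move loop stays the same.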
Abstract:
A graph processing system that supports automatic data model conversion from Resource Description Framework (RDF) to Property Graph (PG) is provided. Rather than using a naive conversion approach that creates PG nodes and edges without properties, a set of conversion rules is evaluated to automatically convert RDF triples into PG nodes and edges with properties, as appropriate. Accordingly, the converted PG data takes full advantage of the PG format while avoiding the creation of extraneous nodes and edges, allowing queries on the PG data to be efficiently executed on any database supporting the PG data model. The rules classify each triple into one of three cases depending on whether or not the predicate is “rdf:type” and whether or not the object is a literal value, generating graph entities as appropriate for each case. Optionally, user-defined rules may override the automatic rules.
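The three-way classification can be sketched as below. The abstract does not say which graph entity each case produces, so the mapping used here is an assumption: an "rdf:type" triple becomes a node label, a triple with a literal object becomes a node property, and any other triple becomes an edge. The `Literal` wrapper and the `convert` function are hypothetical names.

```python
from dataclasses import dataclass

RDF_TYPE = "rdf:type"

@dataclass(frozen=True)
class Literal:
    """Marks an object as a literal value rather than a resource."""
    value: object

def convert(triples):
    """Convert (subject, predicate, object) triples into PG nodes and edges."""
    nodes = {}   # resource -> {"labels": set, "props": dict}
    edges = []   # (source, label, destination)

    def node(resource):
        return nodes.setdefault(resource, {"labels": set(), "props": {}})

    for subject, predicate, obj in triples:
        entry = node(subject)
        if predicate == RDF_TYPE:
            entry["labels"].add(obj)               # case 1: type becomes a label
        elif isinstance(obj, Literal):
            entry["props"][predicate] = obj.value  # case 2: literal becomes a property
        else:
            node(obj)                              # case 3: resource object becomes an edge
            edges.append((subject, predicate, obj))
    return nodes, edges

triples = [
    (":alice", RDF_TYPE, ":Person"),
    (":alice", ":age", Literal(30)),
    (":alice", ":knows", ":bob"),
]
nodes, edges = convert(triples)
```

Under this mapping the three triples yield two nodes and one edge, whereas a naive conversion would create a node for `:Person` and for the value 30 plus an edge for every triple.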
Abstract:
A graph processing system is provided for executing scouting queries to improve query planning. A query planner creates a plurality of scouting queries, each scouting query corresponding to a query plan for a graph query and having an associated confidence value. The graph processing system performs limited execution of the scouting queries and determines a metric value for each scouting query based on that execution. The system determines a score for each scouting query based on its metric value and the confidence value of the corresponding query plan, and selects a query plan based on the scores. The system then executes the graph query based on the selected query plan.
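The selection step can be sketched as follows. The abstract does not define the metric or the scoring formula, so both are assumptions here: the metric is the total cost observed over a fixed number of execution steps, and the score is that metric divided by the plan's confidence, with lower scores preferred. The names `QueryPlan`, `scout`, and `select_plan` are hypothetical.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Iterator

@dataclass
class QueryPlan:
    name: str
    confidence: float                      # assumed to lie in (0, 1]
    execute: Callable[[], Iterator[int]]   # yields the cost of each execution step

def scout(plan: QueryPlan, budget: int) -> int:
    """Limited execution: run at most `budget` steps and sum their cost."""
    return sum(islice(plan.execute(), budget))

def select_plan(plans, budget):
    """Score each plan from its scouting metric and confidence; pick the lowest score."""
    scores = {}
    for plan in plans:
        metric = scout(plan, budget)
        scores[plan.name] = metric / plan.confidence   # assumed scoring formula
    best = min(plans, key=lambda plan: scores[plan.name])
    return best, scores

plans = [
    QueryPlan("index_scan", 0.9, lambda: iter([5] * 100)),
    QueryPlan("hash_join", 0.2, lambda: iter([2] * 100)),
]
best, scores = select_plan(plans, budget=3)
```

Here `hash_join` looks cheaper during scouting, but its low confidence inflates its score, so `index_scan` is selected for full execution.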