Abstract:
Techniques support graph pattern matching queries inside a relational database management system (RDBMS) that supports SQL execution. The techniques compile a graph pattern matching query into a SQL query that can then be executed by the relational engine. As a result, techniques enable execution of graph pattern matching queries on top of the relational engine by avoiding any change in the existing SQL engine.
Abstract:
Techniques are described herein for executing queries with a recursive clause using a plurality of processes that execute database operations in parallel. Each process of the plurality of processes either generate or are assigned a segment that is part of a temporary table. For each iteration of the recursive query, work granules are divided up and assigned to each of the plurality of processes. As each respective process produces a portion of a result set for a given iteration, that process appends said portion of the result set to the respective segment that the respective process manages. Each slave process then publishes, to one or more sources, a reference to the newly generated results. During the next iteration, any slave process may access any of the data from the previous iteration.
Abstract:
A method and system for processing database queries containing aggregate functions. The query may specify fewer groups than there are processes available to process the queries. Further, the queries may target a set of rows and specify a sort-by key and a group-by key. The method and system further includes determining that the queries specify application of the aggregate function to each of a plurality of groups that may correspond to a plurality of distinct values of the group-by key and determining that plurality of processes are available to process the queries. The method and system also includes determining the plurality of ranges of a composite key that may be formed by combining the group-by key and the sort-by key and assigning each range of the plurality ranges to a corresponding process to calculate the aggregate function.
Abstract:
Techniques are described herein for executing queries with a recursive clause using a plurality of processes that execute database operations in parallel. Each process of the plurality of processes either generate or are assigned a segment that is part of a temporary table. For each iteration of the recursive query, work granules are divided up and assigned to each of the plurality of processes. As each respective process produces a portion of a result set for a given iteration, that process appends said portion of the result set to the respective segment that the respective process manages. Each slave process then publishes, to one or more sources, a reference to the newly generated results. During the next iteration, any slave process may access any of the data from the previous iteration.
Abstract:
According to one aspect of the invention, for a database statement that specifies evaluating reporting window functions, a computation-pushdown execution strategy may be used for the database statement. The computation-pushdown execution plan includes producer operators and consolidation operators. Each producer operator computes a respective partial aggregation for each reporting window function based on a subset of rows, and broadcasts the respective partial aggregation. Each consolidation operator fully aggregates all partial aggregations broadcasted from the producer operators. Alternatively, an extended-data-distribution-key execution plan may be used. Each producer operator sends rows based on hash keys to sort operators for computing partial aggregations for at least one reporting window function based on a subset of rows. Each consolidation operator receives and fully aggregates all partial aggregations broadcasted from the sort operators.
Abstract:
Techniques are provided for generating a “dimensional zonemap” that allows a database server to avoid scanning disk blocks of a fact table based on filter predicates in a query that qualify one or more dimension tables. The zonemap divides the fact table into sets of contiguous disk blocks referred to as “zones”. For each zone, a minimum value and a maximum value for each of one or more “zoned” columns of the dimension tables is determined and maintained in the zonemap. For a query that contains a filter predicate on a zoned column, the predicate value can be compared to the minimum value and maximum value maintained for a zone for that zoned column to determine whether a scan of the disk blocks of the zone can be skipped.
Abstract:
According to one aspect of the invention, for a database statement that specifies evaluating reporting window functions, a computation-pushdown execution strategy may be used for the database statement. The computation-pushdown execution plan includes producer operators and consolidation operators. Each producer operator computes a respective partial aggregation for each reporting window function based on a subset of rows, and broadcasts the respective partial aggregation. Each consolidation operator fully aggregates all partial aggregations broadcasted from the producer operators. Alternatively, an extended-data-distribution-key execution plan may be used. Each producer operator sends rows based on hash keys to sort operators for computing partial aggregations for at least one reporting window function based on a subset of rows. Each consolidation operator receives and fully aggregates all partial aggregations broadcasted from the sort operators.
Abstract:
Techniques support graph pattern matching queries inside a relational database management system (RDBMS) that supports SQL execution. The techniques compile a graph pattern matching query into a SQL query that can then be executed by the relational engine. As a result, techniques enable execution of graph pattern matching queries on top of the relational engine by avoiding any change in the existing SQL engine.
Abstract:
Techniques for automatically partitioning materialized views are provided. In one technique, a definition of a materialized view is identified. Based on the definition, multiple candidate partitioning schemes are identified. A query is generated that indicates one or more of the candidate partitioning schemes. The query is then executed, where executing the query results in one or more partition counts, each corresponding to a different candidate partitioning scheme of the one or more candidate partitioning schemes. Based on the one or more partition counts, a candidate partitioning scheme is selected from among the plurality of candidate partitioning schemes. The materialized view is automatically partitioned based on the candidate partitioning scheme.
Abstract:
Techniques described herein allow a user of an RDBMS to specify a graph algorithm function (GAF) that takes a graph object as input and returns a logical graph object as output. GAFs are used within graph queries to compute temporary and output properties (“GAF-computed properties”), which are live for the duration of the query cursor execution. GAF-computed output properties are accessible in the enclosing graph pattern matching query as though they were part of the input graph object of the GAF. Temporary cursor-duration tables are generated for the query cursor during compilation of a graph query that includes a GAF, and are used to store the GAF-computed properties. Each temporary table corresponds to one of the primary tables of the input graph, and includes, as a foreign key, primary key information from the corresponding primary table. Thus, the input graph of a GAF may be a “heterogeneous” graph.