Abstract:
Embodiments generate random walks through a directed graph that is represented in a relational database table. Each row of the graph table represents a directed edge in the graph and includes a source vertex and a destination vertex. Each row is further augmented to (a) indicate the number of outbound edges starting from the destination vertex in the row and (b) include an identifier that distinguishes the edge from other outbound edges starting from the same source vertex. An SQL query may be executed on the augmented graph table. Starting from a source vertex (the starting vertex or the destination vertex of the previously selected hop), the query randomly selects a row of the graph table representing one of the outbound edges from the source vertex and adds the selected outbound edge as a row in a random walk table that represents the next hop in the random walk.
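The hop selection described above can be sketched in a few lines of Python. This is an illustration only, not the SQL query of the embodiments: the table contents, the column order, the 0-based edge identifiers, and the names `EDGES`, `BY_SRC_AND_ID`, and `random_walk` are all assumptions made for the example.

```python
import random

# Hypothetical augmented edge table: (src, dst, edge_id, dst_out_degree).
# edge_id numbers the outbound edges of each source vertex from 0 (assumed);
# dst_out_degree is the number of outbound edges leaving dst.
EDGES = [
    ("a", "b", 0, 1),
    ("a", "c", 1, 2),
    ("b", "c", 0, 2),
    ("c", "a", 0, 2),
    ("c", "d", 1, 0),  # d is a sink: it has no outbound edges
]

# Dictionary standing in for an index lookup on (src, edge_id).
BY_SRC_AND_ID = {(src, eid): (src, dst, eid, deg) for src, dst, eid, deg in EDGES}


def random_walk(start, start_out_degree, max_hops, rng):
    """Return the list of edge rows visited, one row per hop."""
    walk = []
    vertex, out_degree = start, start_out_degree
    for _ in range(max_hops):
        if out_degree == 0:  # dead end: the walk cannot continue
            break
        edge_id = rng.randrange(out_degree)  # uniform choice among outbound edges
        row = BY_SRC_AND_ID[(vertex, edge_id)]
        walk.append(row)
        # The stored out-degree of the destination lets the next hop pick an
        # edge without first counting that vertex's outbound edges.
        vertex, out_degree = row[1], row[3]
    return walk


walk = random_walk("a", 2, 5, random.Random(0))
```

In this sketch, the per-source edge identifier turns "pick a random outbound edge" into a single keyed lookup, which appears to be the purpose of the two augmented columns.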
Abstract:
Techniques are described for storing and maintaining, in a materialized view, bitmap data that represents a bitmap of each possible distinct value of an expression, and for rewriting a query for a count of distinct values of the expression to use the materialized view. The materialized view contains bitmap data that represents a bitmap of each possible distinct value of a first expression, together with aggregate values of additional expressions, and is stored in memory or on disk by a database system. The database system receives a query that requests the number of distinct values of the first expression and an aggregate value for an additional expression. In response, the database system rewrites the query to compute the number of distinct values by counting the bits in the bitmap data of the materialized view that are set to the first value, and to obtain the aggregate value for the additional expression from the materialized view.
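A minimal Python sketch of the idea follows. It assumes the distinct-counted expression is a small non-negative integer (`customer_id`), that the view is grouped by a hypothetical `region` column, and that bitmaps from several groups are merged with a bitwise OR before counting; none of these details come from the abstract.

```python
# Hypothetical base rows: (region, customer_id, amount).
ROWS = [
    ("east", 3, 10.0),
    ("east", 5, 20.0),
    ("east", 3, 5.0),
    ("west", 5, 7.0),
    ("west", 9, 1.0),
]


def build_view(rows):
    """Materialize one (bitmap, sum) pair per region.

    Bit i of the bitmap is set when customer_id == i occurs in the region.
    """
    view = {}
    for region, customer_id, amount in rows:
        bitmap, total = view.get(region, (0, 0.0))
        view[region] = (bitmap | (1 << customer_id), total + amount)
    return view


def count_distinct_and_sum(view, regions):
    """Answer COUNT(DISTINCT customer_id) and SUM(amount) from the view alone."""
    merged, total = 0, 0.0
    for region in regions:
        bitmap, subtotal = view[region]
        merged |= bitmap  # OR merges bitmaps without double counting a value
        total += subtotal
    return bin(merged).count("1"), total  # count the set bits


VIEW = build_view(ROWS)
```

Because the bitmaps combine with OR, a customer who appears in both regions is counted once, which a stored per-group distinct count could not guarantee.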
Abstract:
A method and system are provided for processing database queries containing aggregate functions. The queries may specify fewer groups than there are processes available to process the queries. Further, the queries may target a set of rows and specify a sort-by key and a group-by key. The method and system further include determining that the queries specify application of the aggregate function to each of a plurality of groups that may correspond to a plurality of distinct values of the group-by key, and determining that a plurality of processes are available to process the queries. The method and system also include determining a plurality of ranges of a composite key that may be formed by combining the group-by key and the sort-by key, and assigning each range of the plurality of ranges to a corresponding process to calculate the aggregate function.
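The range assignment can be sketched as follows, under several assumptions: the aggregate is an order-sensitive string concatenation, ranges are equal-sized slices of the rows sorted by the composite key, and the "processes" are simulated by a plain loop rather than real parallel workers. The function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (group_key, sort_key, value).
ROWS = [
    ("x", 3, "c"), ("y", 1, "p"), ("x", 1, "a"),
    ("x", 4, "d"), ("y", 2, "q"), ("x", 2, "b"),
]


def split_into_ranges(rows, n_workers):
    """Sort by the composite (group_key, sort_key) and cut into contiguous ranges."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    size = max(1, -(-len(ordered) // n_workers))  # ceiling division, at least 1
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def partial_aggregate(chunk):
    """One worker's job: concatenate values per group, in sort-key order."""
    out = defaultdict(str)
    for group_key, _sort_key, value in chunk:
        out[group_key] += value
    return dict(out)


def ordered_concat(rows, n_workers):
    """Combine partial results in range order, so per-group order is preserved."""
    result = defaultdict(str)
    # Each chunk could be handed to its own process; a loop stands in here.
    for chunk in split_into_ranges(rows, n_workers):
        for group_key, text in partial_aggregate(chunk).items():
            result[group_key] += text
    return dict(result)
```

With only two groups (`x` and `y`), partitioning on the group-by key alone would keep at most two workers busy; cutting ranges on the composite key lets a single large group be spread across several workers.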
Abstract:
Techniques are provided herein for processing a query using in-memory cursor duration temporary tables. The techniques involve storing a part of the temporary table in memory of nodes in a database cluster. A part of the temporary table may be stored in disk segments of nodes in the database cluster. Writer threads running on a particular node write data for the temporary table to the memory of the particular node. Excess data may be written to the disk segment of the particular node. Reader threads running on the particular node read data for the temporary table from the memory of the particular node and the disk segment of the particular node.
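The write and read paths on a single node might look like the sketch below. The class name `NodeTempTable`, the row-count capacity limit, and the use of a Python list in place of a disk segment are assumptions for illustration; threading and cluster coordination are left out.

```python
class NodeTempTable:
    """Per-node portion of a temporary table: memory first, then a disk segment."""

    def __init__(self, memory_capacity):
        self.memory_capacity = memory_capacity  # max rows held in memory (assumed unit)
        self.memory = []
        self.disk_segment = []  # list standing in for an on-disk segment

    def write(self, row):
        """Writer path: fill memory, then spill excess rows to the disk segment."""
        if len(self.memory) < self.memory_capacity:
            self.memory.append(row)
        else:
            self.disk_segment.append(row)

    def read_all(self):
        """Reader path: return rows from memory, then from the disk segment."""
        return self.memory + self.disk_segment


node = NodeTempTable(memory_capacity=2)
for row in [("k1", 1), ("k2", 2), ("k3", 3)]:
    node.write(row)
```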
Abstract:
Techniques are provided for generating a “dimensional zonemap” that allows a database server to avoid scanning disk blocks of a fact table based on filter predicates in a query that qualify one or more dimension tables. The zonemap divides the fact table into sets of contiguous disk blocks referred to as “zones”. For each zone, a minimum value and a maximum value for each of one or more “zoned” columns of the dimension tables is determined and maintained in the zonemap. For a query that contains a filter predicate on a zoned column, the predicate value can be compared to the minimum value and maximum value maintained for a zone for that zoned column to determine whether a scan of the disk blocks of the zone can be skipped.
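The pruning test can be illustrated with a small Python sketch. The tables, the `state` zoned column, the zone size of two blocks, and the function names are hypothetical; the sketch handles only an equality predicate.

```python
# Hypothetical dimension table: store_id -> state (the "zoned" column).
STORES = {1: "CA", 2: "CA", 3: "NY", 4: "TX"}

# Hypothetical fact table as a list of disk blocks; each row is (store_id, amount).
FACT_BLOCKS = [
    [(1, 10), (2, 20)],
    [(2, 5), (1, 7)],
    [(3, 9), (4, 3)],
    [(4, 8), (4, 1)],
]
BLOCKS_PER_ZONE = 2


def build_zonemap(blocks, dim, blocks_per_zone):
    """Record (first_block, last_block, min_state, max_state) for each zone."""
    zonemap = []
    for start in range(0, len(blocks), blocks_per_zone):
        zone = blocks[start:start + blocks_per_zone]
        # Join each fact row to the dimension table to get its zoned-column value.
        states = [dim[store_id] for block in zone for store_id, _ in block]
        zonemap.append((start, start + len(zone) - 1, min(states), max(states)))
    return zonemap


def zones_to_scan(zonemap, state):
    """Keep only zones whose [min, max] range could contain the predicate value."""
    return [(lo, hi) for lo, hi, zmin, zmax in zonemap if zmin <= state <= zmax]


ZONEMAP = build_zonemap(FACT_BLOCKS, STORES, BLOCKS_PER_ZONE)
```

A zone is skipped only when the predicate value falls outside its range. A value inside the range, such as `"OR"` between `"NY"` and `"TX"`, still forces a scan even though no row matches, so the zonemap can produce false positives but never false negatives.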
Abstract:
A method and one or more non-transitory storage media for materialized view-based query rewrite are provided. A plurality of materialized views is created based on a fact table and a dimension table. Each materialized view is based on a join between the dimension table and the fact table based on a respective foreign key column of the fact table. A database management system executes a query against the fact table and the dimension table, the query requiring one or more joins between the dimension table and the fact table based on one or more foreign key columns. For each given join of the one or more joins, responsive to the given join satisfying one or more rewrite criteria, the query is rewritten to replace the join between the dimension table and the fact table with a join between a respective materialized view for the given join and the fact table.
Abstract:
In one technique, a definition of a materialized view is identified. Based on the definition, multiple candidate partitioning schemes are identified. A query is generated that indicates one or more of the candidate partitioning schemes. The query is then executed, where executing the query results in one or more partition counts, each corresponding to a different candidate partitioning scheme of the one or more candidate partitioning schemes. Based on the one or more partition counts, a candidate partitioning scheme is selected from among the plurality of candidate partitioning schemes. The materialized view is automatically partitioned based on the candidate partitioning scheme.
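The selection step can be sketched as below. The abstract does not state how the partition counts drive the choice, so the rule used here (take the candidate with the most partitions that does not exceed a hypothetical `max_partitions` limit) is purely an assumption, as are the sample rows and candidate columns.

```python
# Hypothetical rows that the materialized view would contain.
MV_ROWS = [
    {"region": "east", "year": 2021, "order_id": 1},
    {"region": "east", "year": 2022, "order_id": 2},
    {"region": "west", "year": 2022, "order_id": 3},
    {"region": "west", "year": 2023, "order_id": 4},
]
CANDIDATE_KEYS = ["region", "year", "order_id"]  # candidate partitioning columns


def partition_counts(rows, candidates):
    """Stand-in for a single query returning COUNT(DISTINCT col) per candidate."""
    return {col: len({row[col] for row in rows}) for col in candidates}


def choose_scheme(counts, max_partitions):
    """Assumed rule: most partitions without exceeding max_partitions."""
    eligible = {col: n for col, n in counts.items() if n <= max_partitions}
    return max(eligible, key=eligible.get) if eligible else None


COUNTS = partition_counts(MV_ROWS, CANDIDATE_KEYS)
```

Computing every count in one pass mirrors the abstract's single generated query, which evaluates all candidates together rather than issuing one query per scheme.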
Abstract:
Disclosed herein are techniques for storing, within a database system, metadata that indicates an intended usage (IU). Once created, an IU may be assigned to a column to (a) indicate how the column is intended to be used, and (b) affect how the database server behaves when database operations involve values from the column. The IU assigned to a column supplements, but does not replace, the datatype definition for the column. Each IU may have an IU-bundle. The IU-bundle of an IU indicates how the database server behaves with respect to any column that is assigned the IU. For example, the IU-bundle may indicate constraints that the database server must validate during operations on values from columns assigned to the IU. Techniques are also described for implementing multi-column IUs and flexible IUs.
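As a rough illustration, the sketch below models an IU-bundle as nothing more than a list of constraint functions checked in addition to the column's datatype. The registry layout, the IU names `email` and `percentage`, and the `validate` function are hypothetical; the abstract suggests that bundles can govern more than constraints.

```python
import re

# Hypothetical registry mapping an IU name to its bundle. A bundle here holds
# only constraint functions, which is a simplification.
IU_BUNDLES = {
    "email": {"constraints": [lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+", v) is not None]},
    "percentage": {"constraints": [lambda v: 0 <= v <= 100]},
}

# Column metadata: the datatype stays, and the IU is recorded alongside it.
COLUMNS = {
    "contact": {"datatype": str, "iu": "email"},
    "discount": {"datatype": float, "iu": "percentage"},
    "note": {"datatype": str, "iu": None},  # no IU assigned
}


def validate(column, value):
    """Check the datatype first, then every constraint in the column's IU-bundle."""
    meta = COLUMNS[column]
    if not isinstance(value, meta["datatype"]):
        return False
    bundle = IU_BUNDLES.get(meta["iu"], {"constraints": []})
    return all(check(value) for check in bundle["constraints"])
```

Both `contact` and `note` are plain strings as far as the datatype is concerned; only the assigned IU makes the server reject `"not-an-email"` for one and accept it for the other.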
Abstract:
Techniques are provided for optimizing storage of database records in segments using sub-segments. A base segment is a container used for storing records that belong to a database object. A database management system receives a request to load, into the database object, a first set of records that are in a first state. In response to receiving the request, the system generates a new sub-segment, which is a container that is separate from the base segment. The system stores the first set of records, in their first state, within the sub-segment. The system then monitors one or more characteristics of the database system. In response to the one or more characteristics satisfying criteria, the system performs a migration of one or more records of the first set of records from the sub-segment to the base segment. During the migration, the system converts the one or more records from the first state to a second state and stores the one or more records, in their second state, in the base segment.
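The load-then-migrate flow can be sketched as follows. The abstract leaves the two record states and the monitored characteristics unspecified, so this example assumes the first state is raw bytes, the second state is zlib-compressed bytes, and the criterion is a hypothetical idle-time threshold.

```python
import zlib


class SegmentedObject:
    """A database object with one base segment and zero or more sub-segments."""

    def __init__(self, idle_threshold):
        self.base_segment = []   # records in their second (compressed) state
        self.sub_segments = []   # each load gets its own container
        self.idle_threshold = idle_threshold  # hypothetical migration criterion

    def load(self, records):
        """Store incoming records as-is in a new sub-segment, not the base segment."""
        self.sub_segments.append(list(records))

    def maybe_migrate(self, seconds_idle):
        """Migrate when the monitored characteristic satisfies the criterion."""
        if seconds_idle < self.idle_threshold:
            return 0
        moved = 0
        for sub in self.sub_segments:
            for record in sub:
                # Convert from the first state (raw bytes) to the second (compressed).
                self.base_segment.append(zlib.compress(record))
                moved += 1
        self.sub_segments.clear()
        return moved


obj = SegmentedObject(idle_threshold=30)
obj.load([b"row-1", b"row-2"])
```

Deferring the conversion keeps the load path cheap: records land unchanged in the sub-segment, and the more expensive state change happens later, when the system judges that conditions allow it.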