Abstract:
Techniques related to cache storage formats are disclosed. In some embodiments, a set of values is stored in a cache as a set of first representations and a set of second representations. For example, the set of first representations may be a set of hardware-level representations, and the set of second representations may be a set of non-hardware-level representations. Responsive to receiving a query to be executed over the set of values, a determination is made as to whether or not it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations. If the determination indicates that it would be more efficient to execute the query over the set of first representations than to execute the query over the set of second representations, the query is executed over the set of first representations.
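As a rough illustration of the decision described above, the following sketch keeps one value set in two representations and routes a predicate to whichever representation a toy cost model estimates to be cheaper. The class and function names (DualFormatCache, estimate_cost), the integer/string split, and the cost model itself are illustrative assumptions; the abstract does not specify how efficiency is determined.

```python
# Sketch of a cache that keeps the same values in two representations and
# routes a query to whichever representation is estimated to be cheaper.
# All names and the cost model are illustrative, not from the patent.

class DualFormatCache:
    def __init__(self, values):
        # "Hardware-level" representation: packed native integers.
        self.first_rep = [int(v) for v in values]
        # "Non-hardware-level" representation: the values as strings.
        self.second_rep = [str(v) for v in values]

    def estimate_cost(self, rep, predicate):
        # Placeholder cost model: assume scanning native integers is
        # cheaper than scanning strings for a numeric predicate.
        return len(rep) * (1 if rep is self.first_rep else 3)

    def execute(self, predicate):
        # Execute over whichever representation has the lower estimated cost.
        if self.estimate_cost(self.first_rep, predicate) <= \
           self.estimate_cost(self.second_rep, predicate):
            return [v for v in self.first_rep if predicate(v)]
        return [v for v in self.second_rep if predicate(int(v))]

cache = DualFormatCache([3, 17, 42, 8])
print(cache.execute(lambda v: v > 10))   # -> [17, 42]
```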
Abstract:
A shared-nothing database system is provided in which parallelism and workload balancing are increased by assigning the rows of each table to “slices”, and storing multiple copies (“duplicas”) of each slice across the persistent storage of multiple nodes of the shared-nothing database system. When the data for a table is distributed among the nodes of a shared-nothing system in this manner, requests to read data from a particular row of the table may be handled by any node that stores a duplica of the slice to which the row is assigned. For each slice, a single duplica of the slice is designated as the “primary duplica”. All DML operations (e.g. inserts, deletes, updates, etc.) that target a particular row of the table are performed by the node that has the primary duplica of the slice to which the particular row is assigned. The changes made by the DML operations are then propagated from the primary duplica to the other duplicas (“secondary duplicas”) of the same slice.
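The routing this scheme implies can be sketched as follows, with invented node names and a hash-based slice assignment standing in for the actual mechanisms: reads may be served by any node holding a duplica of the row's slice, while DML is directed to the node holding the primary duplica.

```python
import random

# Illustrative routing for a shared-nothing system with slices and duplicas.
# The slice count, node names, and hash-based assignment are assumptions.

NUM_SLICES = 4
# For each slice: the nodes holding a duplica; the first is the primary.
DUPLICA_MAP = {
    0: ["node_a", "node_b"],
    1: ["node_b", "node_c"],
    2: ["node_c", "node_a"],
    3: ["node_a", "node_c"],
}

def slice_of(row_key):
    # Assume rows are assigned to slices by hashing the row key.
    return hash(row_key) % NUM_SLICES

def node_for_read(row_key):
    # Any duplica of the slice can serve a read of the row.
    return random.choice(DUPLICA_MAP[slice_of(row_key)])

def node_for_dml(row_key):
    # All inserts, updates, and deletes go to the primary duplica;
    # changes are then propagated to the secondary duplicas.
    return DUPLICA_MAP[slice_of(row_key)][0]

print(node_for_read("order-1001"), node_for_dml("order-1001"))
```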
Abstract:
Techniques related to efficient evaluation of aggregate functions are disclosed. Computing device(s) may perform a method for aggregating results of performing a multiplication on a first column and a second column of a database table. A first vector stores a subset of values of the first column. A second vector stores a corresponding subset of values of the second column. When it is determined that the first vector has a lower cardinality than the second vector, a third vector stores at least a first distinct value and a second distinct value of the first vector. A first set of one or more values of the second vector is determined, wherein each value of the first set of one or more values corresponds to the first distinct value in the first vector. A first multiplicand is generated based on performing a summation over the first set of one or more values.
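The rewrite behind this abstract can be stated concretely: SUM(first * second) equals, for each distinct value d of the lower-cardinality column, d multiplied by the summation of the second column's values in the rows where the first column equals d. The sketch below shows that identity with illustrative vector names; it is not the patented implementation.

```python
from collections import defaultdict

# Sketch of the aggregation rewrite:
#   SUM(first * second) == sum over each distinct value d of the
#   lower-cardinality vector of d * SUM(second where first == d).

def sum_of_products(first_vec, second_vec):
    # Let the lower-cardinality vector supply the distinct values.
    if len(set(first_vec)) > len(set(second_vec)):
        first_vec, second_vec = second_vec, first_vec

    # For each distinct value, sum the corresponding values of the other vector.
    partial_sums = defaultdict(int)
    for d, v in zip(first_vec, second_vec):
        partial_sums[d] += v

    # Each multiplicand is (distinct value) * (summation over its matches).
    return sum(d * s for d, s in partial_sums.items())

first = [2, 2, 3, 2, 3]
second = [10, 20, 30, 40, 50]
assert sum_of_products(first, second) == sum(a * b for a, b in zip(first, second))
print(sum_of_products(first, second))   # -> 380
```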
Abstract:
Techniques are described for materializing computations in memory. In an embodiment, responsive to a database server instance receiving a query, the database server instance identifies a set of computations for evaluation during execution of the query. Responsive to identifying the set of computations, the database server instance evaluates at least one computation in the set of computations to obtain a first set of computation results for a first computation in the set of computations. After evaluating the at least one computation, the database server instance stores, within an in-memory unit, the first set of computation results. The database server also stores mapping data that maps a set of metadata values associated with the first computation to the first set of computation results.
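A minimal sketch of this flow, assuming a Python dictionary as the mapping data, a list as the in-memory unit, and a tuple of metadata values as the key; these structures and names are illustrative, not the patent's.

```python
# Sketch of materializing a computation's results in an in-memory unit,
# with mapping data from the computation's metadata values to the results.

class InMemoryUnit:
    def __init__(self):
        self.results = []       # stored sets of computation results
        self.mapping = {}       # metadata values -> index into self.results

    def materialize(self, metadata, compute, rows):
        # Map the computation's metadata values to its stored results.
        key = tuple(sorted(metadata.items()))
        if key not in self.mapping:
            self.mapping[key] = len(self.results)
            self.results.append([compute(r) for r in rows])
        return self.results[self.mapping[key]]

unit = InMemoryUnit()
rows = [{"price": 10, "qty": 3}, {"price": 4, "qty": 5}]
meta = {"expr": "price * qty", "table": "sales"}
# The first call evaluates the computation; later queries with the same
# metadata reuse the stored results instead of recomputing them.
print(unit.materialize(meta, lambda r: r["price"] * r["qty"], rows))  # [30, 20]
print(unit.materialize(meta, lambda r: r["price"] * r["qty"], rows))  # reused
```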
Abstract:
Techniques are provided for more efficiently using the bandwidth of the I/O path between a CPU and volatile memory during the performance of database operations. Relational data from a relational table is stored in volatile memory as column vectors, where each column vector contains values for a particular column of the table. A binary-comparable format may be used to represent each value within a column vector, regardless of the data type associated with the column. The column vectors may be compressed and/or encoded while in volatile memory, and decompressed/decoded on-the-fly within the CPU. Alternatively, the CPU may be designed to perform operations directly on the compressed and/or encoded column vector data. In addition, techniques are described that enable the CPU to perform vector processing operations on the column vector values.
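A small sketch of an encoded column vector, using dictionary encoding purely as one example of "compressed and/or encoded" data and decoding each value on the fly during a scan; the encoding choice and class name are assumptions.

```python
# Sketch of a column vector kept in memory in an encoded form and decoded
# on the fly during a scan.  Dictionary encoding is an illustrative choice.

class EncodedColumnVector:
    def __init__(self, values):
        # Build a dictionary of distinct values; store a small code per row.
        self.dictionary = sorted(set(values))
        code_of = {v: i for i, v in enumerate(self.dictionary)}
        self.codes = [code_of[v] for v in values]

    def scan(self, predicate):
        # Decode each code on the fly and apply the predicate.
        return [row for row, code in enumerate(self.codes)
                if predicate(self.dictionary[code])]

cv = EncodedColumnVector(["CA", "NY", "CA", "TX", "NY"])
print(cv.scan(lambda state: state == "CA"))   # -> [0, 2]
```

With an order-preserving dictionary such as the sorted one above, equality and range predicates could also be evaluated directly on the codes without decoding, which corresponds to the alternative in which the CPU operates directly on the encoded column vector data.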
Abstract:
Techniques related to efficient evaluation of queries with multiple predicate expressions are disclosed. A first predicate expression (PE) is evaluated against a plurality of rows in a first column vector (CV) to determine that a subset of rows does not satisfy the first PE. The subset comprises less than all of the plurality of rows. When a query specifies the first PE in conjunction with a second PE, a selectivity of the first PE is determined. If the selectivity meets a threshold, the second PE is evaluated against all of the plurality of rows in a second CV. If the selectivity does not meet the threshold, the second PE is evaluated against only the subset of rows in the second CV. When a query specifies the first PE in disjunction with the second PE, the second PE may be evaluated against only the subset of rows in the second CV.
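As one concrete case, the disjunction path can be sketched: rows that satisfy the first predicate are already in the result, so the second predicate only needs to be evaluated over the subset of rows that did not satisfy the first. The column vectors, predicates, and names below are illustrative; the conjunction path additionally applies the selectivity threshold described above to decide whether the second predicate is run over all rows or over a subset.

```python
# Sketch of the disjunction case: the second predicate expression (PE) is
# evaluated only over the subset of rows that did not satisfy the first PE.

def eval_disjunction(cv1, cv2, pe1, pe2):
    results = []
    subset = []
    for i, v in enumerate(cv1):
        # Rows satisfying the first PE are in the result; the rest form
        # the subset that still needs the second PE.
        (results if pe1(v) else subset).append(i)
    # The second PE is evaluated against only that subset of the second CV.
    results.extend(i for i in subset if pe2(cv2[i]))
    return sorted(results)

ages = [25, 31, 40, 19, 52]
salaries = [40, 90, 70, 30, 120]
# WHERE age > 30 OR salary > 100
print(eval_disjunction(ages, salaries, lambda a: a > 30, lambda s: s > 100))
# -> [1, 2, 4]
```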
Abstract:
Techniques related to efficient evaluation of query expressions including grouping clauses are disclosed. Computing device(s) perform a method for aggregating a measure column vector (MCV) according to a plurality of grouping column vectors (GCVs). The MCV and each of the plurality of GCVs may be encoded. The method includes determining a plurality of actual grouping keys (AGKs) and generating a dense identifier (DI) mapping that maps the plurality of AGKs to a plurality of DIs. Each AGK occurs in the plurality of GCVs. Each DI corresponds to a respective workspace. Aggregating the MCV involves aggregating, in each workspace, one or more codes of the MCV that correspond to an AGK mapped to a DI corresponding to the workspace. For a first row of the MCV and the plurality of GCVs, aggregating the one or more codes includes generating a particular grouping key based on codes in the plurality of GCVs.
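A sketch of this grouping scheme, assuming SUM as the aggregate and small integer codes in the grouping and measure vectors (both assumptions): each grouping-key combination that actually occurs is assigned a dense identifier, and each dense identifier owns a workspace into which the measure codes for its rows are aggregated.

```python
# Sketch of aggregating a measure column vector (MCV) by several grouping
# column vectors (GCVs) using dense identifiers and per-identifier workspaces.

def group_aggregate(gcvs, mcv):
    di_mapping = {}      # actual grouping key -> dense identifier
    workspaces = []      # one accumulator per dense identifier

    for row in range(len(mcv)):
        # Build the grouping key for this row from the codes in each GCV.
        key = tuple(gcv[row] for gcv in gcvs)
        if key not in di_mapping:
            di_mapping[key] = len(workspaces)
            workspaces.append(0)
        # Aggregate this row's measure code into the key's workspace.
        workspaces[di_mapping[key]] += mcv[row]

    return {key: workspaces[di] for key, di in di_mapping.items()}

region = [0, 1, 0, 1, 0]       # encoded grouping column
year   = [7, 7, 8, 7, 8]       # encoded grouping column
amount = [10, 20, 30, 40, 50]  # measure column
print(group_aggregate([region, year], amount))
# -> {(0, 7): 10, (1, 7): 60, (0, 8): 80}
```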
Abstract:
A method and apparatus for efficiently processing data in various formats in a single instruction multiple data (“SIMD”) architecture are presented. Specifically, a method to unpack fixed-width bit values in a bit stream into a fixed-width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte-packed values in a byte stream in a SIMD architecture is presented. A method to decompress a run-length-encoded compressed bit-vector in a SIMD architecture is presented. A method to return the offset of each bit set to one in a bit-vector in a SIMD architecture is presented. A method to fetch bits from a bit-vector at specified offsets relative to a base in a SIMD architecture is presented. A method to compare values stored in two SIMD registers is presented.
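Of the operations listed, the fixed-width unpack is the easiest to state as a scalar reference. The sketch below shows what the unpacking computes, not how the SIMD architecture computes it; the packing helper and little-endian bit layout are assumptions for illustration.

```python
# Scalar reference for unpacking fixed-width bit values from a packed bit
# stream into one integer per value.  The patent performs this with SIMD
# registers; this plain-Python version only shows the result of the operation.

def pack_fixed_width(values, width):
    # Helper (not from the patent): pack values little-endian, `width` bits each.
    acc = 0
    for i, v in enumerate(values):
        acc |= (v & ((1 << width) - 1)) << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return acc.to_bytes(nbytes, "little")

def unpack_fixed_width(packed, width, count):
    # Widen each `width`-bit field back out to a full integer value.
    acc = int.from_bytes(packed, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]

packed = pack_fixed_width([1, 2, 3, 4, 5], width=3)
print(list(packed))                         # two bytes holding five 3-bit values
print(unpack_fixed_width(packed, 3, 5))     # -> [1, 2, 3, 4, 5]
```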
Abstract:
Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data.
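As a rough sketch of this dual-format arrangement, the following keeps the persistent data in one (here row-oriented) format and builds a second, memory-only (here columnar) copy on demand; queries are served from whichever format is populated. The row/column split and the class name are assumptions standing in for the two formats.

```python
# Sketch of exposing persistently stored rows in a second, memory-only format.

class DualFormatTable:
    def __init__(self, rows, columns):
        self.columns = columns
        self.on_disk_rows = rows          # persistent, on-disk-based format
        self.in_memory_columns = None     # format kept only in volatile memory

    def populate_in_memory(self):
        # Build the format that is independent of the on-disk format.
        self.in_memory_columns = {
            c: [row[i] for row in self.on_disk_rows]
            for i, c in enumerate(self.columns)
        }

    def scan_column(self, column, predicate):
        # Serve the query from whichever format is available.
        if self.in_memory_columns is not None:
            return [v for v in self.in_memory_columns[column] if predicate(v)]
        i = self.columns.index(column)
        return [row[i] for row in self.on_disk_rows if predicate(row[i])]

t = DualFormatTable([(1, "CA"), (2, "NY"), (3, "CA")], ["id", "state"])
t.populate_in_memory()
print(t.scan_column("state", lambda s: s == "CA"))   # -> ['CA', 'CA']
```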