Abstract:
A method and apparatus for efficiently processing data in various formats in a single instruction multiple data (“SIMD”) architecture is presented. Specifically, a method to unpack a fixed-width bit values in a bit stream to a fixed width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte packed values in a byte stream in a SIMD architecture is presented. A method to decompress a run length encoded compressed bit-vector in a SIMD architecture is presented. A method to return the offset of each bit set to one in a bit-vector in a SIMD architecture is presented. A method to fetch bits from a bit-vector at specified offsets relative to a base in a SIMD architecture is presented. A method to compare values stored in two SIMD registers is presented.
Abstract:
A computer analyzes a relational schema of a database to generate a data entry schema and encodes the data entry schema as JSON. The data entry schema is sent to a database client so that the client can validate entered data before the entered data is sent for storage. From the client, entered data is received that conforms to the data entry schema because the client used the data entry schema to validate the entered data before sending the data. Into the database, the entered data is stored that conforms to the data entry schema. The data entry schema and the relational schema have corresponding constraints on a datum to be stored, such as a range limit for a database column or an express set of distinct valid values. A constraint may specify a format mask or regular expression that values in the column should conform to, or a correlation between values of multiple columns.
Abstract:
Techniques are described for offloading remote direct memory operations (RDMOs) to “execution candidates”. The execution candidates may be any hardware capable of performing the offloaded operation. Thus, the execution candidates may be network interface controllers, specialized co-processors, FPGAs, etc. The execution candidates may be on a machine that is remote from the processor that is offloading the operation, or may be on the same machine as the processor that is offloading the operation. Details for certain specific RDMOs, which are particularly useful in online transaction processing (OLTP) and hybrid transactional/analytical (HTAP) workloads, are provided.
Abstract:
Techniques are provided to allow more sophisticated operations to be performed remotely by machines that are not fully functional. Operations that can be performed reliably by a machine that has experienced a hardware and/or software error are referred to herein as Remote Direct Memory Operations or “RDMOs”. Unlike RDMAs, which typically involve trivially simple operations such as the retrieval of a single value from the memory of a remote machine, RDMOs may be arbitrarily complex. The techniques described herein can help applications run without interruption when there are software faults or glitches on a remote system with which they interact.
Abstract:
Techniques are provided to allow more sophisticated operations to be performed remotely by machines that are not fully functional. Operations that can be performed reliably by a machine that has experienced a hardware and/or software error are referred to herein as Remote Direct Memory Operations or “RDMOs”. Unlike RDMAs, which typically involve trivially simple operations such as the retrieval of a single value from the memory of a remote machine, RDMOs may be arbitrarily complex. The techniques described herein can help applications run without interruption when there are software faults or glitches on a remote system with which they interact.
Abstract:
Techniques are described herein for distributing distinct portions of a database object across volatile memories of selected nodes of a plurality of nodes in a clustered database system. The techniques involve storing a unit-to-service mapping that associates a unit (a database object or portion thereof) to one or more database services. The one or more database services are mapped to one or more nodes. The nodes to which a service is mapped may include nodes in disjoint database systems, so long as those database systems have access to a replica of the unit. The database object is treated as in-memory enabled by nodes that are associated with the service, and are treated as not in-memory enabled by nodes that are not associated with the service.
Abstract:
Techniques related to efficient data storage and retrieval using a heterogeneous main memory are disclosed. A database includes a set of persistent format (PF) data that is stored on persistent storage in a persistent format. The database is maintained on the persistent storage and is accessible to a database server. The database server converts the set of PF data to sets of mirror format (MF) data and stores the MF data in a hierarchy of random-access memories (RAMs). Each RAM in the hierarchy has an associated latency that is different from a latency associated with any other RAM in the hierarchy. Storing the sets of MF data in the hierarchy of RAMs includes (1) selecting, based on one or more criteria, a respective RAM in the hierarchy to store each set of MF data and (2) storing said each set of MF data in the respective RAM.
Abstract:
Techniques are provided for maintaining data persistently in one format, but making that data available to a database server in more than one format. For example, one of the formats in which the data is made available for query processing is based on the on-disk format, while another of the formats in which the data is made available for query processing is independent of the on-disk format. Data that is in the format that is independent of the disk format may be maintained exclusively in volatile memory to reduce the overhead associated with keeping the data in sync with the on-disk format copies of the data.
Abstract:
Techniques are provided for determining costs for alternative execution plans for a query, where at least a portion of the data items required by the query are in in-memory compression-units within volatile memory. The techniques involve maintaining in-memory statistics, such as statistics that indicate what fraction of a table is currently present in in-memory compression units, and the cost of decompressing in-memory compression units. Those statistics are used to determine, for example, the cost of a table scan that retrieves some or all of the necessary data items from the in-memory compression-units.
Abstract:
A method and apparatus for efficiently processing data in various formats in a single instruction multiple data (“SIMD”) architecture is presented. Specifically, a method to unpack a fixed-width bit values in a bit stream to a fixed width byte stream in a SIMD architecture is presented. A method to unpack variable-length byte packed values in a byte stream in a SIMD architecture is presented. A method to decompress a run length encoded compressed bit-vector in a SIMD architecture is presented. A method to return the offset of each bit set to one in a bit-vector in a SIMD architecture is presented. A method to fetch bits from a bit-vector at specified offsets relative to a base in a SIMD architecture is presented. A method to compare values stored in two SIMD registers is presented.