Abstract:
A database-aware storage server creates snapshots instantly, without first creating an intermediate test master database. While a snapshot is being created, the source database stays read-write and completes ongoing reads and writes. The database-aware storage server allows writable snapshots to be layered in a hierarchy, and all of the resulting databases share common data blocks. New writes performed by a database after the snapshot are stored in blocks of sparse files. This promotes space sharing and reduces the total space used by the related databases. Allocations for the source and all new snapshot databases come from the same common pool of storage. The newly created snapshot databases can access the data store directly without going through an intermediate layer.
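The block-sharing behavior described in this abstract can be illustrated with a small copy-on-write model. This is only a sketch under stated assumptions: the `Layer` and `Database` classes, the dictionary standing in for a sparse file, and the parent-chain lookup are all hypothetical and are not taken from the abstract, which does not say how blocks are located.

```python
class Layer:
    """A sparse set of blocks; holds only blocks written at this layer (hypothetical)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}  # block number -> data; stands in for a sparse file


class Database:
    """A database whose writable top layer sits over shared, frozen layers (assumption)."""

    def __init__(self, top=None):
        self.top = top if top is not None else Layer()

    def write(self, block_no, data):
        # New writes always land in this database's own sparse layer.
        self.top.blocks[block_no] = data

    def read(self, block_no):
        # Walk from the private layer down through the shared ancestors.
        layer = self.top
        while layer is not None:
            if block_no in layer.blocks:
                return layer.blocks[block_no]
            layer = layer.parent
        return None

    def snapshot(self):
        # Freeze the current top layer and share it; the source keeps
        # accepting writes in a fresh layer, so it stays read-write.
        frozen = self.top
        self.top = Layer(parent=frozen)
        return Database(Layer(parent=frozen))


source = Database()
source.write(0, "a")
source.write(1, "b")

snap = source.snapshot()       # no blocks are copied here
snap.write(1, "B")             # diverges only in the snapshot
source.write(0, "A")           # diverges only in the source

child = snap.snapshot()        # a snapshot of a snapshot
```

In this model, creating a snapshot copies no data, and each database stores only the blocks it has changed since the snapshot was taken.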
Abstract:
Region summaries of database data are stored in persistent memory of a storage cell. Because the region summaries are stored in persistent memory, when a storage cell is powered off and data in volatile memory is not retained, region summaries are nevertheless preserved in persistent memory. When the storage cell comes online, the region summaries already exist and may be used without the delay attendant to regenerating the region summaries stored in volatile memory.
Abstract:
Techniques are described herein for generating and using in-memory data structures to represent columns in data block sets. In an embodiment, a database management system (DBMS) receives a query for a target data set managed by the DBMS. The query may specify a predicate for a column of the target data set. The predicate may include a filtering value to be compared with row values of the column of the target data set. Prior to accessing data block sets storing the target data set from persistent storage, the DBMS identifies an in-memory summary that corresponds to a data block set, in an embodiment. The in-memory summary may include in-memory data structures, each representing a column stored in the data block set. The DBMS determines that a particular in-memory data structure exists in the in-memory summary that represents a portion of values of the column indicated in the predicate of the query. Based on the particular in-memory data structure, the DBMS determines whether or not the data block set can possibly contain the filtering value in the column of the target data set. Based on this determination, the DBMS skips or retrieves the data block set from the persistent storage as part of the query evaluation.
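The skip-or-retrieve decision described in this abstract can be sketched as follows. The abstract does not specify what the in-memory data structures are, so this example assumes the simplest possible one, a per-column minimum and maximum; `ColumnSummary`, `build_summary`, and `scan_equals` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class ColumnSummary:
    """Assumed in-memory structure: the value range of one column in one block set."""
    lo: int
    hi: int

    def may_contain(self, value):
        # False means the block set definitely lacks the value;
        # True means only that it might contain it.
        return self.lo <= value <= self.hi


def build_summary(rows):
    """Build a summary (column name -> ColumnSummary) for one data block set."""
    summary = {}
    for row in rows:
        for col, value in row.items():
            s = summary.get(col)
            if s is None:
                summary[col] = ColumnSummary(value, value)
            else:
                s.lo = min(s.lo, value)
                s.hi = max(s.hi, value)
    return summary


def scan_equals(block_sets, summaries, column, value):
    """Evaluate `column = value`, skipping block sets the summaries rule out."""
    matches, sets_read = [], 0
    for set_id, rows in block_sets.items():
        col_summary = summaries[set_id].get(column)
        if col_summary is not None and not col_summary.may_contain(value):
            continue  # skipped: no read from persistent storage is needed
        sets_read += 1  # stands in for retrieving the block set
        matches.extend(r for r in rows if r[column] == value)
    return matches, sets_read


block_sets = {
    "A": [{"id": 1, "qty": 5}, {"id": 2, "qty": 9}],
    "B": [{"id": 3, "qty": 20}, {"id": 4, "qty": 31}],
}
summaries = {set_id: build_summary(rows) for set_id, rows in block_sets.items()}
matches, sets_read = scan_equals(block_sets, summaries, "qty", 20)
```

A range summary can only rule a block set out. A value that falls inside the range but is absent from the data still causes a read, which is why the check is phrased as "can possibly contain".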
Abstract:
According to embodiments, a derived cache that is derived from a first instance of particular data is used to speed up queries and other operations over a second instance of the particular data. Traditionally, a DBMS generates and uses derived cache data only for the database data from which the derived data was derived. However, according to embodiments, derived cache data associated with a first instance of database data is relocated to the location of a second, newly created, instance of the database data. Since the derived cache data is derived from an identical copy of the database data, the cache data derived for the first instance can successfully be used to speed up applications running over the second instance of the database data.
Abstract:
Techniques are described herein for supporting multiple versions of a database server within a database machine comprising a separate database layer and storage layer. In an embodiment, the database layer includes compute nodes each hosting one or more instances of a database server. The storage layer includes storage nodes each hosting one or more instances of a storage server, also referred to herein as a “cell server.” In general, the database servers may receive data requests, such as SQL queries, from client applications and service the requests in coordination with the cell servers of the storage layer.
Abstract:
A shared storage architecture persistently stores database files in non-volatile random access memories (NVRAMs) of computing nodes of a multi-node DBMS. The computing nodes of the multi-node DBMS not only collectively store database data on NVRAMs of the computing nodes, but also host database server instances that process queries in parallel, host database sessions and database processes, and together manage access to a database stored on the NVRAMs of the computing nodes. To perform a data block read operation from persistent storage, a data block may be transferred directly over a network from NVRAM of a computing node that persistently stores the data block to a database buffer in non-volatile RAM of another computing node that requests the data block. The transfer is accomplished using remote direct memory access (“RDMA”).
Abstract:
Techniques for optimizing a query with an extrema function are provided. In main memory, a data summary is maintained for a plurality of extents stored by at least one storage server. The data summary includes an extent minimum value and an extent maximum value for one or more columns. A storage server request is received, from a database server, based on a query with an extrema function applied to a particular column of a particular table. The data summaries for a set of relevant extents are processed by maintaining at least one global extrema value corresponding to the extrema function and, for each relevant extent of the set of relevant extents, determining whether to scan records of the relevant extent based on at least one of the global extrema value and an extent summary value of the data summary of the relevant extent.
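The pruning loop described in this abstract can be sketched for a `MAX` query. This is an illustration under assumptions, not the claimed method: `max_with_summaries` is a hypothetical function, extents are modeled as plain lists of column values, and each summary is assumed to be a `(min, max)` pair that bounds the extent's values.

```python
def max_with_summaries(extents, summaries):
    """Compute MAX over a column, scanning an extent only if its summary can raise the result.

    extents:   extent id -> list of column values (stands in for stored records)
    summaries: extent id -> (extent_min, extent_max), assumed to bound the values
    """
    global_max = None  # the global extrema value maintained across extents
    scanned = []
    for extent_id, records in extents.items():
        extent_max = summaries[extent_id][1]
        if global_max is not None and extent_max <= global_max:
            continue  # this extent cannot improve the running maximum
        scanned.append(extent_id)
        for value in records:
            if global_max is None or value > global_max:
                global_max = value
    return global_max, scanned


extents = {"e1": [3, 8, 5], "e2": [1, 2, 7], "e3": [9, 4]}
summaries = {"e1": (3, 8), "e2": (1, 7), "e3": (4, 9)}
result, scanned = max_with_summaries(extents, summaries)
```

Here `e2` is never scanned, because its summary maximum of 7 cannot exceed the running maximum of 8 found in `e1`. Visiting extents in descending order of their summary maximum would likely prune more, at the cost of a sort.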