Abstract:
A log-structured data store may implement efficient garbage collection. Log records may be maintained in data blocks according to a log record sequence. Based, at least in part, on a log reclamation point, the log records may be evaluated to identify data blocks to reclaim that have log records in the log sequence prior to the log reclamation point. New versions of data pages updated by log records in the identified data blocks may be generated and stored in base page storage for the log structured data store. The identified data blocks may then be reclaimed for storing new data.
Abstract:
Self-describing data blocks of a minimum atomic write size may be stored for a data store. Data may be received for storage in a data block of a plurality of data blocks at a persistent storage device that are equivalent to a minimum atomic write size for the persistent storage device. Metadata may be generated for the data that includes an error detection code which is generated for the data and the metadata together. The data and the metadata are sent to the persistent storage device to store together in the data block. An individual atomic write operation may write together the data and the metadata in the data block. When accessed, the error detection code is applicable to detect errors. The metadata may also be applicable to determine whether the data is stored for a currently assigned purpose or a previously assigned purpose of the data block.
Abstract:
A database system may include a database service and a separate distributed storage service. The database service (or a database engine head node thereof) may be responsible for query parsing, optimization, and execution, transactionality, and consistency, while the storage service may be responsible for generating data pages from redo log records and for durability of those data pages. For example, in response to a write request directed to a particular data page, the database engine head node may generate a redo log record and send it, but not the data page, to a storage service node. The storage service node may store the redo log record and return a write acknowledgement to the database service prior to applying the redo log record. The server node may apply the redo log record and other redo log records to a previously stored version of the data page to create a current version.
Abstract:
Self-describing data blocks of a minimum atomic write size may be stored for a data store. Data may be received for storage in a data block of a plurality of data blocks at a persistent storage device that are equivalent to a minimum atomic write size for the persistent storage device. Metadata may be generated for the data that includes an error detection code which is generated for the data and the metadata together. The data and the metadata are sent to the persistent storage to device to store together in the data block. An individual atomic write operation may write together the data and the metadata in the data block. When accessed, the error detection code is applicable to detect errors. The metadata may also be applicable to determine whether the data is stored for a currently assigned purpose or a previously assigned purpose of the data block.
Abstract:
A log-structured data store may implement optimized log storage for asynchronous log updates. In some embodiments, log records may be received indicating updates to data stored for a storage client and indicating positions in a log record sequence. The log records themselves may not be guaranteed to be received according to the log record sequence. Received log records may be stored in a hot log portion of a block-based storage device according to an order in which they are received. Log records in the hot log portion may then be identified to be moved to a cold log portion of the block-based storage device in order to complete a next portion of the log record sequence. Log records may be modified, such as compressed, or coalesced, before being stored together in a data block of the cold log portion according to the log record sequence.
Abstract:
A database system may include a database service and a separate distributed storage service. The database service (or a database engine head node thereof) may be responsible for query parsing, optimization, and execution, transactionality, and consistency, while the storage service may be responsible for generating data pages from redo log records and for durability of those data pages. For example, in response to a write request directed to a particular data page, the database engine head node may generate a redo log record and send it, but not the data page, to a storage service node. The storage service node may store the redo log record and return a write acknowledgement to the database service prior to applying the redo log record. The server node may apply the redo log record and other redo log records to a previously stored version of the data page to create a current version.