Abstract:
Techniques described and suggested herein include systems and methods for storing, indexing, and retrieving original data of data archives on data storage systems using redundancy coding techniques. For example, redundancy codes, such as erasure codes, may be applied to archives (such as those received from a customer of a computing resource service provider) so as allow the storage of original data of the individual archives available on a minimum of volumes, such as those of a data storage system, while retaining availability, durability, and other guarantees imparted by the application of the redundancy code. Sparse indexing techniques may be implemented so as to reduce the footprint of indexes used to locate the original data, once stored.
Abstract:
Techniques described and suggested herein include distributed deletion request processing and verification. For example, incident to migration of original data from a first data store to a second data store, verifications and confirmations related to removing the original data from the first data store may be performed so as to ensure the integrity of the original data represented on the second data store prior to removing the actual original data on the first data store. In some embodiments, the verifications and confirmations performed in connection with a deletion request may be apportioned to multiple entities, each of which may not fully trust the others. As a result, in some embodiments, a given deletion request may only be fulfilled if all of the entities involved in the verification process individually provide authorization to execute the deletion request.
Abstract:
Non-blocking processing of federated transactions may be implemented for distributed data partitions. A transaction may be received that specifies keys at data nodes to lock in order to perform the transaction. Lock requests are generated and sent to the data nodes which identify sibling keys to be locked at other data nodes for the transaction. In response to receiving the lock requests, data nodes may send to lock queues indicating other lock requests for the keys at the data node. An evaluation of the lock queues based, at least in part, on an ordering of the lock requests in the lock queues may be performed to identify a particular transaction to commit. Once identified, a request to commit the identified transaction may be sent to the particular data nodes indicated by the sibling keys in a lock request for the identified transaction.
Abstract:
A system and method for obtaining a request to perform a data operation with a volume, wherein the volume is a logical storage space in which data objects may be stored, determining a plurality of zones for performing the data operation with the volume, wherein each zone of the plurality of zones comprises a series of sectors of a computer-readable storage medium that forms an append-only section of the computer-readable storage medium, and performing the data operation with the volume on the plurality of zones.
Abstract:
Techniques described and suggested herein include systems and methods for storing, indexing, and retrieving original data of data archives on data storage systems using redundancy coding techniques. For example, redundancy codes, such as erasure codes, may be applied to archives (such as those received from a customer of a computing resource service provider) so as allow the storage of original data of the individual archives available on a minimum of volumes, such as those of a data storage system, while retaining availability, durability, and other guarantees imparted by the application of the redundancy code. Sparse indexing techniques may be implemented so as to reduce the footprint of indexes used to locate the original data, once stored. The volumes may be apportioned into failure-decorrelated subsets, and archives stored thereto may be apportioned to such subsets.
Abstract:
Techniques described and suggested herein include various methods and systems for verifying integrity of redundancy coded data, such as erasure coded data shards. In some embodiments, a quantity of redundancy coded data elements, hereafter referred to as data shards (e.g., erasure coded data shards), sufficient to reconstruct the original data element from which the redundancy coded data elements are derived, is used to generate reconstructed data shards to be used for checking the validity of analogous data shards stored for the original data element.
Abstract:
Techniques described and suggested herein include systems and methods for storing, indexing, and retrieving original data of data archives on data storage systems using redundancy coding techniques. For example, redundancy codes, such as erasure codes, may be applied to archives (such as those received from a customer of a computing resource service provider) so as allow the storage of original data of the individual archives available on a minimum of volumes, such as those of a data storage system, while retaining availability, durability, and other guarantees imparted by the application of the redundancy code. Sparse indexing techniques may be implemented so as to reduce the footprint of indexes used to locate the original data, once stored. The volumes may be apportioned into failure-decorrelated subsets, and archives stored thereto may be apportioned to such subsets.
Abstract:
Techniques described and suggested herein include systems and methods for precomputing regeneration information for data archives (“archives”) that have been processed and stored using redundancy coding techniques. For example, regeneration information, such as redundancy code-related matrices (such as inverted matrices based on, e.g., a generator matrix for the selected redundancy code) corresponding to subsets of the shards, is computed for each subset and, in some embodiments, stored for use in the event that one or more shards becomes unavailable, e.g., so as to more efficiently and/or quickly regenerate a replacement shard.
Abstract:
Techniques described and suggested herein include systems and methods for precomputing regeneration information for data archives (“archives”) that have been processed and stored using redundancy coding techniques. For example, regeneration information, such as redundancy code-related matrices (such as inverted matrices based on, e.g., a generator matrix for the selected redundancy code) corresponding to subsets of the shards, is computed for each subset and, in some embodiments, stored for use in the event that one or more shards becomes unavailable, e.g., so as to more efficiently and/or quickly regenerate a replacement shard.