Abstract:
A block-based storage system may implement reducing durability state for a data volume. A determination may be made that storage node replicating write requests for a data volume is unavailable. In response, subsequent write requests may be processed according to a reduced durability state for the data volume such that replication for the data volume may be disabled for the storage node. Write requests may then be completed at a fewer number of storage nodes prior to acknowledging the write request as complete. Durability state for the data volume may be increase in various embodiments. A storage node may be identified and replication operations may be performed to synchronize the current data volume at the storage node with a replica of the data volume maintained at the identified storage node.
Abstract:
A data storage system includes multiple head nodes and data storage sleds. The data storage sleds include multiple mass storage devices and a sled controller. Respective ones of the head nodes are configured to obtain credentials for accessing particular portions of the mass storage devices of the data storage sleds. A sled controller of a data storage sled determines whether a head node attempting to perform a write on a mass storage device of a data storage sled that includes the sled controller is presenting with the write request a valid credential for accessing the mass storage devices of the data storage sled. If the credentials are valid, the sled controller causes the write to be performed and if the credentials are invalid, the sled controller returns a message to the head node indicating that it has been fenced off from the mass storage device.
Abstract:
A block-based storage system may implement page cache write logging. Write requests for a data volume maintained at a storage node may be received at a storage node. A page cache for may be updated in accordance with the request. A log record describing the page cache update may be stored in a page cache write log maintained in a persistent storage device. Once the write request is performed in the page cache and recorded in a log record in the page cache write log, the write request may be acknowledged. Upon recovery from a system failure where data in the page cache is lost, log records in the page cache write log may be replayed to restore to the page cache a state of the page cache prior to the system failure.
Abstract:
A block-based storage system may implement reducing durability state for a data volume. A determination may be made that storage node replicating write requests for a data volume is unavailable. In response, subsequent write requests may be processed according to a reduced durability state for the data volume such that replication for the data volume may be disabled for the storage node. Write requests may then be completed at a fewer number of storage nodes prior to acknowledging the write request as complete. Durability state for the data volume may be increase in various embodiments. A storage node may be identified and replication operations may be performed to synchronize the current data volume at the storage node with a replica of the data volume maintained at the identified storage node.
Abstract:
A data storage system includes multiple data storage units and a zonal control plane. The zonal control plane assigns volumes to respective ones of the data storage units. The data storage units include multiple head nodes and data storage sleds. At least one of the head nodes implements a local control plane for the data storage unit. Also, the head nodes of each data storage unit are configured to service read and write requests directed to one or more volumes serviced by the data storage unit independent of the zonal control plane.
Abstract:
A data storage system includes multiple head nodes and multiple data storage sleds mounted in a rack. For a particular volume or volume partition one of the head nodes is designated as a primary head node for the volume or volume partition. The primary head node is configured to store data for the volume in a data storage of the primary head node and cause the data to be replicated to a secondary head node. The primary head node is also configured to cause the data for the volume to be stored in a plurality of respective mass storage devices each in different ones of the plurality of data storage sleds of the data storage system.
Abstract:
A data storage system includes a rack, multiple head nodes, multiple data storage sleds, and at least two networking devices. The at least two network devices are configured to implement at least two redundant networks within the data storage system. Also, each of the head nodes is assigned at least two network addresses for communication with the data storage sleds of the data storage system via the at least two networking devices. The data storage sleds each include multiple mass storage devices and a sled controller that is configured to couple with the at least two network switches. In some embodiments, the data storage system further includes redundant power systems within a rack in which the head nodes, the data storage sleds, and the at least two networking devices are mounted.
Abstract:
Write optimization for block-based storage performing snapshot operations may be implemented. Write requests for a particular data volume may be received for which a snapshot operation is in progress. A determination may be made as to whether a data chunk of the data volume modified as part of the write request has not yet been stored to a remote snapshot data store as part of the snapshot operation. For a data chunk that is to be modified and that has not yet been stored, the data chunk may be stored in a local in-memory volume snapshot buffer. Once the data chunk is stored in the in-memory volume snapshot buffer, the write request may be performed and acknowledged as complete. The data chunk may be sent to the remote snapshot data store asynchronously with regard to the acknowledgment of the write request.