Abstract:
The present disclosure generally relates to creating virtualized block storage devices whose data is replicated across isolated computing systems to lower the risk of data loss even in wide-scale events, such as natural disasters. The virtualized device can include at least two volumes, each of which is implemented in a distinct computing system. Each volume can be implemented by at least two computing devices, a first of which is configured as a primary device to which reads from and writes to the volume are directed. Of the two volumes, one can be designated as primary, indicating authority to accept reads from and writes to the virtualized device. A primary device of the primary volume, on obtaining a write to the volume, can replicate the write both to a secondary device of the primary volume and to the secondary volume.
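The write path described in this abstract can be illustrated with a minimal in-memory sketch. The class names (Device, Volume, VirtualizedDevice), the zone labels, and the way the secondary volume fans a write out to its own two devices are assumptions made for illustration; the abstract states only that the primary device replicates to the secondary device of the primary volume and to the secondary volume.

```python
class Device:
    """One computing device holding a copy of a volume's blocks (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def apply(self, offset, data):
        self.blocks[offset] = data


class Volume:
    """A volume implemented by two devices, one primary and one secondary."""

    def __init__(self, zone):
        self.primary_device = Device(f"{zone}-primary")
        self.secondary_device = Device(f"{zone}-secondary")

    def devices(self):
        return [self.primary_device, self.secondary_device]


class VirtualizedDevice:
    """Two volumes in distinct (hypothetical) zones; one is the primary volume."""

    def __init__(self):
        self.primary_volume = Volume("zone-a")
        self.secondary_volume = Volume("zone-b")

    def write(self, offset, data):
        primary = self.primary_volume
        # The primary device of the primary volume accepts the write...
        primary.primary_device.apply(offset, data)
        # ...replicates it to the secondary device of the same volume...
        primary.secondary_device.apply(offset, data)
        # ...and forwards it to the secondary volume.
        # Assumption: the secondary volume applies it on both of its devices.
        for device in self.secondary_volume.devices():
            device.apply(offset, data)

    def read(self, offset):
        # Reads are served by the primary device of the primary volume.
        return self.primary_volume.primary_device.blocks.get(offset)
```

In this sketch replication is synchronous and cannot fail; a real system would also need acknowledgements, failure handling, and a way to change which volume is primary.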
Abstract:
A pricing policy to be applied to token population changes at a token bucket used for admission control during burst-mode operations at a work target is determined. Over a time period, changes to the token population of that bucket are recorded. An amount to be charged to a client is determined, based on the recorded changes in token population and an associated pricing amount indicated in the policy.
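As a rough illustration of the billing flow, the sketch below records every change to a bucket's token population and derives a charge from the recorded changes. The names PricingPolicy, MeteredBucket, and amount_to_charge are hypothetical, and charging a flat price per consumed token is an assumed policy; the abstract does not specify how the pricing amount is applied.

```python
from dataclasses import dataclass, field


@dataclass
class PricingPolicy:
    # Assumption: a flat price per token consumed during burst mode.
    price_per_token: float


@dataclass
class MeteredBucket:
    population: int
    changes: list = field(default_factory=list)  # signed population deltas

    def consume(self, n):
        """Remove n tokens if available and record the change."""
        if n > self.population:
            return False
        self.population -= n
        self.changes.append(-n)
        return True

    def refill(self, n):
        """Add n tokens and record the change."""
        self.population += n
        self.changes.append(n)


def amount_to_charge(bucket, policy):
    """Charge for tokens consumed over the recorded period (assumed policy)."""
    consumed = sum(-delta for delta in bucket.changes if delta < 0)
    return consumed * policy.price_per_token
```

For example, a bucket that starts with 100 tokens, has 30 and then 20 consumed, and is refilled by 10 in between would be billed for 50 tokens under this assumed policy.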
Abstract:
Methods and apparatus for equitable distribution of excess shared-resource throughput capacity are disclosed. A first and a second work target are configured to access a shared resource to implement accepted work requests. Admission control is managed at the work targets using respective token buckets. A first metric indicative of the work request arrival rates at the work targets during a time interval, and a second metric associated with the provisioned capacities of the work targets are determined. A number of tokens determined based on a throughput limit of the shared resource is distributed among the work targets to be used for admission control during a subsequent time interval. The number of tokens distributed to each work target is based on the first metric and/or the second metric.
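One way to picture the distribution step is a weighted blend of the two metrics. The function name distribute_tokens, the weight parameter, and the rule for handing out rounding leftovers are assumptions for illustration; the abstract says only that the number of tokens given to each work target is based on the first metric and/or the second metric.

```python
def distribute_tokens(total_tokens, arrival_rates, provisioned, weight=0.5):
    """Split total_tokens among work targets.

    weight is the (assumed) emphasis on arrival rate; 1 - weight goes to
    provisioned capacity. Returns one integer token count per work target.
    """
    n = len(arrival_rates)
    total_rate = sum(arrival_rates)
    total_prov = sum(provisioned)
    shares = []
    for rate, prov in zip(arrival_rates, provisioned):
        # Fall back to an equal share if a metric sums to zero.
        rate_share = rate / total_rate if total_rate else 1 / n
        prov_share = prov / total_prov if total_prov else 1 / n
        shares.append(weight * rate_share + (1 - weight) * prov_share)

    allocation = [int(total_tokens * share) for share in shares]
    # Hand tokens lost to rounding down to the targets with the largest shares.
    leftover = max(total_tokens - sum(allocation), 0)
    by_share = sorted(range(n), key=lambda i: shares[i], reverse=True)
    for i in by_share[:leftover]:
        allocation[i] += 1
    return allocation
```

With 100 tokens, arrival rates of 30 and 10, and equal provisioned capacities, the blended shares are 0.625 and 0.375, so the busier work target receives the larger part of the excess capacity without starving the other.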
Abstract:
At a client-side component of a storage group, a read descriptor generated in response to a read request directed to a first data store is received. The read descriptor includes a state transition indicator corresponding to a write that has been applied at the first data store. A write descriptor indicative of a write that depends on a result of the read request is generated at the client-side component. The read descriptor and the write descriptor are included in a commit request for a candidate transaction at the client-side component, and transmitted to a transaction manager.
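The shape of such a commit request can be sketched as follows. All class and field names are hypothetical, the state transition indicator is modeled as an integer sequence number, and the transaction manager's conflict check (reject when a key was written after the read was taken) is an assumed optimistic concurrency rule; the abstract does not describe how the transaction manager decides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadDescriptor:
    data_store: str
    key: str
    # Assumption: sequence number of the last write applied when the read ran.
    state_transition_indicator: int


@dataclass(frozen=True)
class WriteDescriptor:
    data_store: str
    key: str
    payload: bytes


@dataclass(frozen=True)
class CommitRequest:
    reads: tuple   # ReadDescriptor instances the transaction depends on
    writes: tuple  # WriteDescriptor instances the transaction wants applied


class TransactionManager:
    """Hypothetical manager using an assumed optimistic conflict check."""

    def __init__(self):
        self.last_write_seq = {}  # (data_store, key) -> sequence number
        self.next_seq = 1

    def submit(self, request):
        # Reject if anything that was read has been overwritten since.
        for read in request.reads:
            latest = self.last_write_seq.get((read.data_store, read.key), 0)
            if latest > read.state_transition_indicator:
                return False
        for write in request.writes:
            self.last_write_seq[(write.data_store, write.key)] = self.next_seq
            self.next_seq += 1
        return True
```

The client-side component would build the ReadDescriptor from the data store's response, derive the WriteDescriptor from the read result, and send both together so the manager can judge the candidate transaction as a unit.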
Abstract:
Distributed database management systems may perform range queries over the leading portion of a primary key. Non-random distribution of data may improve performance related to the processing of range queries, but may tend to cause workload to be concentrated on particular partitions. Groups of partitions may be expanded and collapsed based on detection of disproportionate workload. Disproportionate write workload may be distributed among a group of partitions that can subsequently be queried using a federated approach. Disproportionate read workload may be distributed among a group of read-only replicated partitions.
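The expand, collapse, and federated-query behavior for write-heavy ranges can be sketched with a toy in-memory group. PartitionGroup and its methods are hypothetical names, writes are spread round-robin as an assumed placement rule, and workload detection is left out entirely.

```python
class PartitionGroup:
    """Toy group of partitions that together hold one hot key range."""

    def __init__(self):
        self.partitions = [[]]  # each partition is a list of (key, value)
        self._next = 0

    def expand(self):
        """Add a partition so that writes can be spread more widely."""
        self.partitions.append([])

    def collapse(self):
        """Merge all partitions back into one once the workload subsides."""
        merged = [item for part in self.partitions for item in part]
        self.partitions = [merged]
        self._next = 0

    def write(self, key, value):
        # Assumption: round-robin placement across the group.
        target = self.partitions[self._next % len(self.partitions)]
        target.append((key, value))
        self._next += 1

    def range_query(self, low, high):
        """Federated query: ask every partition, then merge the results."""
        hits = [
            item
            for part in self.partitions
            for item in part
            if low <= item[0] <= high
        ]
        return sorted(hits)
```

Because any partition in the group may hold any key in the range, a range query has to visit all of them, which is the cost paid for spreading the write workload.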
Abstract:
Methods and apparatus for burst-mode admission control using token buckets are disclosed. A work request (such as a read or a write) directed to a work target is received. Based on a first criterion, a determination is made that the work target is in a burst mode of operation. A token population of a burst-mode token bucket is determined, and if the population meets a second criterion, the work request is accepted for execution.
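A minimal sketch of the two-step decision follows. The abstract leaves both criteria open, so the ones used here are assumptions: the work target is treated as being in burst mode when its normal-mode bucket has no tokens left, and a request is accepted in burst mode when the burst-mode bucket holds at least one token. TokenBucket and admit are hypothetical names, and token refill is omitted.

```python
class TokenBucket:
    def __init__(self, population):
        self.population = population

    def consume(self, n=1):
        self.population -= n


def admit(normal_bucket, burst_bucket):
    """Return True if a work request is accepted for execution."""
    # First criterion (assumed): an empty normal-mode bucket means burst mode.
    in_burst_mode = normal_bucket.population < 1
    if not in_burst_mode:
        normal_bucket.consume()
        return True
    # Second criterion (assumed): the burst-mode bucket must hold a token.
    if burst_bucket.population >= 1:
        burst_bucket.consume()
        return True
    return False
```

With two normal-mode tokens and one burst-mode token, the first three requests are accepted and the fourth is rejected until one of the buckets is refilled.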
Abstract:
A system that implements distributed storage may schedule and track control plane operations for performance at the distributed storage service. Information may be maintained for control plane events detected at a distributed storage system. Resource utilization for currently performing control plane operations and currently scheduled control plane operations of the distributed storage system may be determined. The information about detected control plane events may be analyzed to schedule control plane operations to be performed in response to detecting the control plane events. As part of scheduling control plane operations, resource constraints may be applied to the determined resource utilization for the distributed storage system.
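The scheduling step can be pictured as admitting operations against a resource budget. The function name schedule, the event fields (name, cost, priority), and the single scalar budget are assumptions for illustration; the abstract does not say which resources are constrained or how events are prioritized.

```python
def schedule(events, in_flight_cost, budget):
    """Pick control plane operations that fit within a resource budget.

    events: list of dicts with hypothetical keys "name", "cost", "priority"
            (a lower priority value is scheduled first).
    in_flight_cost: utilization of operations already running or scheduled.
    budget: the (assumed) resource constraint for the storage system.
    """
    scheduled = []
    used = in_flight_cost
    for event in sorted(events, key=lambda e: e["priority"]):
        if used + event["cost"] <= budget:
            scheduled.append(event["name"])
            used += event["cost"]
    return scheduled
```

Operations that do not fit are simply left for a later scheduling pass in this sketch.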
Abstract:
Methods and apparatus for token-sharing mechanisms for burst-mode operations are disclosed. A first and a second token bucket are respectively configured for admission control at a first and a second work target. A number of tokens to be transferred between the first bucket and the second bucket, as well as the direction of the transfer, are determined, for example based on messages exchanged between the work targets. The token transfer is initiated, and admission control decisions at the work targets are made based on the token population resulting from the transfer.
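To make the transfer decision concrete, the sketch below picks a token count and a direction from the two bucket populations. The rule used, moving half of the difference from the fuller bucket to the emptier one, is an assumption chosen for illustration, as are the function names; the abstract says only that the count and direction are determined, for example from messages exchanged between the work targets.

```python
def plan_transfer(population_a, population_b):
    """Return (count, direction) for a transfer between buckets a and b."""
    difference = population_a - population_b
    count = abs(difference) // 2  # assumed rule: move half of the gap
    if count == 0:
        return 0, None
    return count, ("a_to_b" if difference > 0 else "b_to_a")


def apply_transfer(population_a, population_b):
    """Return the two populations after the planned transfer."""
    count, direction = plan_transfer(population_a, population_b)
    if direction == "a_to_b":
        return population_a - count, population_b + count
    if direction == "b_to_a":
        return population_a + count, population_b - count
    return population_a, population_b
```

Admission control at each work target would then use the post-transfer population returned by apply_transfer.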
Abstract:
A data analytics system may receive query definitions from which relationships between datasets may be identified. The query definitions may be analyzed to determine estimated costs and frequencies of combining a first and second dataset. Based on the cost and frequency, a combined dataset may be generated by joining data from the first and second datasets. The combined dataset may be stored. Queries that comprise instructions to combine the first and second datasets may be processed by instead accessing the combined dataset.
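The decision of which dataset pairs to pre-join can be sketched as a frequency count weighted by an estimated cost. The function name, the shape of a query definition (a dict with a "joins" list), the cost table, and the threshold test are all assumptions for illustration; the abstract does not say how cost and frequency are combined.

```python
from collections import Counter


def pairs_to_materialize(query_definitions, estimated_join_cost, threshold):
    """Return dataset pairs worth storing as a combined dataset.

    query_definitions: list of dicts, each with a hypothetical "joins" key
                       holding (dataset, dataset) pairs used by the query.
    estimated_join_cost: dict mapping frozenset({a, b}) to an estimated cost.
    threshold: assumed cutoff on frequency * cost.
    """
    frequency = Counter()
    for definition in query_definitions:
        for left, right in definition["joins"]:
            # A frozenset makes (a, b) and (b, a) count as the same pair.
            frequency[frozenset((left, right))] += 1

    chosen = []
    for pair, count in frequency.items():
        if count * estimated_join_cost[pair] >= threshold:
            chosen.append(tuple(sorted(pair)))
    return sorted(chosen)
```

Queries that join a chosen pair could then be rewritten to read the stored combined dataset instead of repeating the join.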
Abstract:
Generally described, aspects of the present application correspond to enabling rapid duplication of data within a data volume hosted on a network storage system. The network storage system can maintain a highly distributed replica of the data volume, designated for duplication of data within the volume and separate from one or more other replicas designated for handling modifications to the data volume. By providing increased parallelization, the highly distributed replica can facilitate rapid duplication of the volume. When a sufficiently large request to duplicate the data volume is received, the system can create additional duplicate portions of the volume to further increase parallelization. For example, a partition of the highly distributed replica may be repeatedly duplicated to create a large number of intermediary duplicate partitions. The intermediary duplicate partitions can then be used to service the duplication request rapidly, due to increased parallelism.
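The benefit of repeated duplication can be seen in a small sketch where every existing copy of a partition serves as a source in each round, so the number of copies roughly doubles per round. The function name duplicate_partition and the in-memory bytes model are assumptions for illustration; the abstract does not specify a doubling schedule.

```python
def duplicate_partition(source, target_count):
    """Create target_count copies of a partition by repeated duplication.

    Returns (copies, rounds). In each round every existing copy is used as
    a source for at most one new copy, so the copies in a round could be
    made in parallel.
    """
    copies = [source]
    rounds = 0
    while len(copies) < target_count:
        needed = target_count - len(copies)
        # The slice is taken before extending, so only copies that existed
        # at the start of the round act as sources.
        copies.extend(bytes(c) for c in copies[:needed])
        rounds += 1
    return copies, rounds
```

Under this schedule, eight intermediary copies take three rounds rather than seven sequential copies from a single source, which is where the increased parallelism comes from.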