Abstract:
Methods and apparatus for equitable distribution of excess shared-resource throughput capacity are disclosed. A first and a second work target are configured to access a shared resource to implement accepted work requests. Admission control is managed at the work targets using respective token buckets. A first metric indicative of the work request arrival rates at the work targets during a time interval, and a second metric associated with the provisioned capacities of the work targets are determined. A number of tokens determined based on a throughput limit of the shared resource is distributed among the work targets to be used for admission control during a subsequent time interval. The number of tokens distributed to each work target is based on the first metric and/or the second metric.
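For illustration only, the following Python sketch shows one way a resource-wide token budget could be split among work targets in proportion to a weighted blend of recent arrival rates and provisioned capacities; the function name, weighting scheme, and example values are assumptions rather than the disclosed method.

    # Illustrative sketch: distribute a shared-resource token budget among work
    # targets for the next interval, weighting each target's share by its recent
    # request arrival rate and its provisioned capacity.
    def distribute_tokens(total_tokens, arrival_rates, provisioned, arrival_weight=0.5):
        """Return per-target token allocations for the next time interval."""
        capacity_weight = 1.0 - arrival_weight
        arrival_sum = sum(arrival_rates) or 1.0
        provisioned_sum = sum(provisioned) or 1.0
        shares = [
            arrival_weight * (a / arrival_sum) + capacity_weight * (p / provisioned_sum)
            for a, p in zip(arrival_rates, provisioned)
        ]
        return [round(total_tokens * s) for s in shares]

    # Two work targets: one saw 800 requests/s, the other 200 requests/s; both are
    # provisioned for 500 ops/s. A shared-resource limit of 1000 tokens is split
    # 650/350 rather than 500/500.
    print(distribute_tokens(1000, [800, 200], [500, 500]))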
Abstract:
A system that implements a scalable data storage service may maintain tables in a data store on behalf of storage service clients. The service may maintain table data in multiple replicas of partitions that are stored on respective computing nodes in the system. In response to detecting an anomaly in the system, detecting a change in the data volume on a partition or in the service request traffic directed to a partition, or receiving a service request from a client to split a partition, the data storage service may create additional copies of a partition replica using a physical copy mechanism. The data storage service may issue a split command defined in an API for the data store to divide the original and additional replicas into multiple replica groups, and to configure each replica group to maintain a respective portion of the table data that was stored in the partition before the split.
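As an informal sketch of the split flow described above, and not the disclosed implementation, the Python below copies a partition's replicas and then divides the originals and the copies into two replica groups at the midpoint of the partition's hash-key range; the class, field, and node names are assumptions.

    from dataclasses import dataclass, field

    @dataclass
    class Partition:
        key_range: tuple                    # (low, high) hash-key boundaries
        replicas: list = field(default_factory=list)

    def split_partition(original: Partition) -> tuple:
        """Copy replicas, then split them into two replica groups at the range midpoint."""
        low, high = original.key_range
        mid = (low + high) // 2
        # Physical-copy step: each existing replica gets an additional copy.
        copies = [f"copy-of-{r}" for r in original.replicas]
        # Split step: the originals keep the lower half of the key range,
        # the additional copies take the upper half.
        left = Partition((low, mid), original.replicas)
        right = Partition((mid, high), copies)
        return left, right

    p = Partition((0, 2**32), ["node-a", "node-b", "node-c"])
    left, right = split_partition(p)
    print(left.key_range, right.key_range)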
Abstract:
Methods and apparatus for token-based pricing policies for burst-mode operations are disclosed. A pricing policy to be applied to token population changes at a token bucket used for admission control during burst-mode operations at a work target is determined. Over a time period, changes to the token population of that bucket are recorded. A billing amount to be charged to a client is determined, based on the recorded changes in token population and an associated pricing amount indicated in the pricing policy.
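The billing computation can be pictured with the following Python sketch, which charges only for tokens consumed during the burst-mode period; the per-token price and the recorded population deltas are illustrative assumptions, not the service's actual pricing.

    def burst_billing_amount(token_deltas, price_per_token_consumed):
        """Charge for tokens consumed (negative population changes) during burst mode."""
        consumed = sum(-d for d in token_deltas if d < 0)
        return consumed * price_per_token_consumed

    # Recorded changes over a billing period: refills (+) and consumptions (-).
    deltas = [+100, -40, -75, +100, -60]
    print(burst_billing_amount(deltas, 0.01))   # 175 tokens consumed -> 1.75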
Abstract:
Methods and apparatus for resource silos at network-accessible services are disclosed. A subset of resources used for a database service, including at least one resource from each of a plurality of data centers, is selected for membership in a resource silo based on grouping criteria. A silo routing layer node identifies the resource silo as the target silo to which a client work request is to be directed. The client work request is sent to a front-end resource of the target silo either by the client, or by the silo routing layer node on behalf of the client. The front-end resource of the target silo transmits a representation of the work request to a back-end resource of the target silo, where a work operation corresponding to the request is performed.
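One possible routing-layer behavior consistent with this description is sketched below in Python: a stable hash of the client identifier selects the target silo, and the request is then passed from that silo's front-end resource to a back-end resource. The silo map, host names, and hash-based grouping criterion are assumptions.

    import hashlib

    SILOS = {
        "silo-0": {"front_end": "fe-0.example.internal", "back_end": "be-0.example.internal"},
        "silo-1": {"front_end": "fe-1.example.internal", "back_end": "be-1.example.internal"},
    }

    def select_target_silo(client_id: str) -> str:
        """Group clients onto silos with a stable hash (one possible grouping criterion)."""
        digest = int(hashlib.sha256(client_id.encode()).hexdigest(), 16)
        names = sorted(SILOS)
        return names[digest % len(names)]

    def route_work_request(client_id: str, request: dict) -> str:
        silo = SILOS[select_target_silo(client_id)]
        # In the described service the front end transmits a representation of the
        # request to the back end; this sketch just reports the chosen path.
        return f"{silo['front_end']} -> {silo['back_end']}: {request['op']}"

    print(route_work_request("client-42", {"op": "query"}))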
Abstract:
Methods and apparatus for compound token buckets usable for burst-mode admission control are disclosed. A peak burst rate and a sustained burst rate of work requests that are to be supported at a work target are determined. The maximum token populations of a peak-burst token bucket and a sustained-burst token bucket are configured, based on the peak burst rate and the sustained burst rate respectively. In response to receiving a work request directed at the work target, a determination to accept the work request for execution is made based at least in part on the token population of the peak-burst token bucket and/or the sustained-burst token bucket.
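A minimal Python sketch of compound-bucket admission control, assuming a simple time-based refill model, is shown below; the burst windows, rates, and class names are example values only.

    import time

    class TokenBucket:
        def __init__(self, max_tokens, refill_rate_per_s):
            self.max_tokens = max_tokens
            self.refill_rate = refill_rate_per_s
            self.tokens = float(max_tokens)
            self.last_refill = time.monotonic()

        def refill(self):
            now = time.monotonic()
            self.tokens = min(self.max_tokens,
                              self.tokens + (now - self.last_refill) * self.refill_rate)
            self.last_refill = now

    class CompoundBucket:
        def __init__(self, peak_burst_rate, sustained_burst_rate,
                     burst_seconds=1, sustain_seconds=60):
            # Maximum populations derived from the peak and sustained burst rates.
            self.peak = TokenBucket(peak_burst_rate * burst_seconds, peak_burst_rate)
            self.sustained = TokenBucket(sustained_burst_rate * sustain_seconds,
                                         sustained_burst_rate)

        def try_accept(self, cost=1):
            """Accept a work request only if both bucket populations can cover it."""
            self.peak.refill()
            self.sustained.refill()
            if self.peak.tokens >= cost and self.sustained.tokens >= cost:
                self.peak.tokens -= cost
                self.sustained.tokens -= cost
                return True
            return False

    bucket = CompoundBucket(peak_burst_rate=1000, sustained_burst_rate=300)
    print(bucket.try_accept())   # True while both buckets hold enough tokens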
Abstract:
Methods and apparatus for token-based admission control for replicated writes are disclosed. Data objects are divided into partitions, and corresponding to each partition, at least a master replica and a slave replica are stored. A determination as to whether to accept a write request directed to the partition is made based at least in part on one or more of (a) available throughput capacity at the master replica, and (b) an indication, obtained using a token-based protocol, of available throughput capacity at the slave replica. If the write request is accepted, one or more data modification operations are initiated.
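The admission decision can be sketched as follows in Python, assuming the master tracks its own token population and periodically receives the slave's reported population through a token-based protocol; the names and the replication step are illustrative only.

    def accept_write(master_tokens: int, slave_reported_tokens: int, cost: int = 1) -> bool:
        """Admission decision for a write directed at a partition's master replica."""
        return master_tokens >= cost and slave_reported_tokens >= cost

    def handle_write(partition, write_request):
        if accept_write(partition["master_tokens"], partition["slave_tokens"]):
            partition["master_tokens"] -= 1
            # Initiate the data modification on the master; replication to the
            # slave proceeds separately in this sketch.
            partition["pending_ops"].append(write_request)
            return "accepted"
        return "rejected"

    partition = {"master_tokens": 5, "slave_tokens": 2, "pending_ops": []}
    print(handle_write(partition, {"key": "item-1", "value": "v2"}))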
Abstract:
A system that implements a data storage service may maintain tables in a data store on behalf of clients. The service may maintain table data in multiple replicas of partitions of the data that are stored on respective computing nodes in the system. In response to detecting a failure or fault condition, or receiving a service request from a client to move or copy a partition replica, the data storage service may copy a partition replica to another computing node using a physical copy mechanism. The physical copy mechanism may copy table data from the physical storage locations in which it is stored to physical storage locations allocated to a destination replica on the other computing node. During copying, service requests to modify table data may be logged and applied to the replica being copied. A catch-up operation may be performed to apply modification requests received during copying to the destination replica.
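A simplified Python sketch of the copy-then-catch-up flow follows; dictionaries stand in for physical storage locations, and the log structure is an assumption.

    def physical_copy(source: dict) -> dict:
        # Stand-in for copying physical storage locations to the destination node.
        return dict(source)

    def catch_up(destination: dict, modification_log: list) -> None:
        """Apply modifications logged while the physical copy was in progress."""
        for key, value in modification_log:
            destination[key] = value

    source_replica = {"item-1": "a", "item-2": "b"}
    log = []                                    # modifications received during copying
    destination_replica = physical_copy(source_replica)
    log.append(("item-2", "b2"))                # write arrives mid-copy and is logged
    catch_up(destination_replica, log)
    print(destination_replica)                  # {'item-1': 'a', 'item-2': 'b2'}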
Abstract:
A system that implements a scalable data storage service may maintain tables in a non-relational data store on behalf of clients. Each table may include multiple items. Each item may include one or more attributes, each containing a name-value pair. Attribute values may be scalars or sets of numbers or strings. The system may provide an API usable to request that values of one or more of an item's attributes be updated. An update request may be conditional on expected values of one or more item attributes (e.g., the same or different item attributes). In response to a request to update the values of one or more item attributes, the previous values and/or updated values may optionally be returned for the updated item attributes or for all attributes of an item targeted by an update request. Items stored in tables may be indexed using a simple or composite primary key.
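For illustration, the Python sketch below models a conditional update of item attributes that returns the previous values of the updated attributes; the table shape, parameter names, and return format are assumptions rather than the service's API.

    def update_item(table, key, updates, expected=None, return_previous=False):
        """Apply attribute updates only if the expected attribute values match."""
        item = table.setdefault(key, {})
        if expected and any(item.get(name) != value for name, value in expected.items()):
            return {"applied": False}
        previous = {name: item.get(name) for name in updates} if return_previous else None
        item.update(updates)
        return {"applied": True, "previous": previous}

    table = {"user#1": {"status": "active", "visits": 3}}
    result = update_item(
        table, "user#1",
        updates={"visits": 4},
        expected={"status": "active"},   # conditional on an expected attribute value
        return_previous=True,
    )
    print(result)   # {'applied': True, 'previous': {'visits': 3}}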
Abstract:
A system that implements a data storage service may store data in multiple replicated partitions on respective storage nodes. The selection of the storage nodes (or storage devices thereof) on which to store the partition replicas may be performed by administrative components that are responsible for partition management and resource allocation for respective groups of storage nodes (e.g., based on a global view of resource capacity or usage), or the selection of particular storage devices of a storage node may be determined by the storage node itself (e.g., based on a local view of resource capacity or usage). Placement policies applied at the administrative layer or storage layer may be based on the percentage or amount of provisioned, reserved, or available storage or IOPS capacity on each storage device, and particular placements (or subsequent operations to move partition replicas) may result in an overall resource utilization that is well balanced.
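One placement policy consistent with this description is sketched below in Python: among a node's storage devices, the device with the lowest blended utilization of provisioned storage and IOPS capacity is selected. The device records, weights, and selection rule are assumptions.

    def utilization(device):
        """Blend provisioned-storage and provisioned-IOPS utilization equally."""
        storage_used = device["provisioned_gb"] / device["total_gb"]
        iops_used = device["provisioned_iops"] / device["total_iops"]
        return 0.5 * storage_used + 0.5 * iops_used

    def place_replica(devices):
        """Return the device on which a new partition replica should be placed."""
        return min(devices, key=utilization)

    devices = [
        {"id": "ssd-0", "total_gb": 800, "provisioned_gb": 600,
         "total_iops": 4000, "provisioned_iops": 1000},
        {"id": "ssd-1", "total_gb": 800, "provisioned_gb": 200,
         "total_iops": 4000, "provisioned_iops": 3500},
    ]
    print(place_replica(devices)["id"])   # ssd-0: lower blended utilization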
Abstract:
A system that provides services to clients may receive and service requests, various ones of which may require different amounts of work. An admission control mechanism may manage requests based on tokens, each of which represents a fixed amount of work. The tokens may be added to a token bucket at a rate that is dependent on a target work throughput rate, as long as the number of tokens in the bucket does not exceed its maximum capacity. If at least a pre-determined minimum number of tokens is present in the bucket when a service request is received, the request may be serviced. Servicing a request may include deducting an initial number of tokens from the bucket, determining that the amount of work performed in servicing the request is different than that represented by the initially deducted tokens, and deducting additional tokens from or replacing tokens in the bucket to reflect the difference.
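The reconciliation step can be pictured with the Python sketch below, in which an initial token count is deducted on admission and the difference is charged or refunded once the actual amount of work is known; the class and parameter names are assumptions.

    class WorkBucket:
        def __init__(self, capacity, min_tokens_to_accept=1):
            self.capacity = capacity
            self.tokens = float(capacity)
            self.min_tokens = min_tokens_to_accept

        def admit(self, initial_cost=1):
            """Accept the request if the bucket holds at least the minimum token count."""
            if self.tokens >= self.min_tokens:
                self.tokens -= initial_cost
                return True
            return False

        def reconcile(self, initial_cost, actual_cost):
            """Deduct extra tokens, or replace tokens, to reflect the work actually done."""
            self.tokens = min(self.capacity, self.tokens + initial_cost - actual_cost)

    bucket = WorkBucket(capacity=100)
    if bucket.admit(initial_cost=1):
        # Servicing revealed the request cost 4 units of work, not 1.
        bucket.reconcile(initial_cost=1, actual_cost=4)
    print(bucket.tokens)   # 96.0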