Abstract:
A non-relational data store may implement validating and non-validating secondary indexes for a table. Operations at a table for a given item may be performed when indexing the item to create a secondary index or when updates to the given item are received. Attribute values of a given item may be validated with respect to an indexing schema for the secondary index. For a non-validating secondary index, validation errors detected for the attribute values may be ignored so that the operation at the table may be performed. For a validating secondary index, validation errors detected for the attribute values may result in denying performance of the operation. In some embodiments, a secondary index may be changed from validating to non-validating, or from non-validating to validating.
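As a rough illustration of the distinction described above, the following Python sketch reduces "validation" to a simple type check against an index schema; the IndexSchema and apply_write names are hypothetical, not taken from the source.

    class ValidationError(Exception):
        pass

    class IndexSchema:
        def __init__(self, key_attributes, validating):
            # key_attributes maps attribute name -> expected Python type (assumed schema form)
            self.key_attributes = key_attributes
            self.validating = validating

        def validate(self, item):
            for name, expected_type in self.key_attributes.items():
                if name in item and not isinstance(item[name], expected_type):
                    raise ValidationError(
                        "attribute %r has type %s, expected %s"
                        % (name, type(item[name]).__name__, expected_type.__name__))

    def apply_write(table, item, index_schemas):
        """Apply a write at the table, honoring validating and non-validating indexes."""
        for schema in index_schemas:
            try:
                schema.validate(item)
            except ValidationError:
                if schema.validating:
                    raise              # validating index: deny the operation at the table
                # non-validating index: ignore the validation error and proceed
        table[item["id"]] = item

    # Example: a non-validating index tolerates a string where an int is expected.
    table = {}
    non_validating = IndexSchema({"score": int}, validating=False)
    apply_write(table, {"id": "a", "score": "not-a-number"}, [non_validating])
    print(table)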
Abstract:
A data storage system may implement locking item ranges for creating a secondary index of an online table. A secondary index may be generated for a table of items stored in a non-relational data store. Different ranges of items in the data store may be locked while a corresponding portion of the secondary index is generated. Upon generating the corresponding portion of the secondary index, a range of items may be unlocked. While generating the secondary index, the table may be made available for servicing access requests. For a request to update the table received during the generation of the secondary index, a determination may be made as to whether the update is included within a locked range of the table. If locked, the request may be delayed until the range is unlocked.
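The range-locking behavior above can be sketched as follows; the RangeLockedIndexer class, the lock granularity, and the single condition variable are illustrative assumptions rather than the described system's implementation, and index maintenance for already-backfilled ranges on the online write path is omitted for brevity.

    import threading

    class RangeLockedIndexer:
        def __init__(self, table_items):
            self.table = dict(table_items)      # primary key -> item
            self.index = {}                     # indexed attribute value -> set of keys
            self.locked_range = None            # (low, high) or None
            self.cond = threading.Condition()

        def _locked(self, key):
            return self.locked_range is not None and \
                self.locked_range[0] <= key < self.locked_range[1]

        def update(self, key, item):
            """Online write path: delay the update while the key's range is locked."""
            with self.cond:
                while self._locked(key):
                    self.cond.wait()            # delayed until the range is unlocked
                self.table[key] = item

        def backfill(self, ranges, indexed_attr):
            """Generate the index one range at a time, locking each range briefly."""
            for low, high in ranges:
                with self.cond:
                    self.locked_range = (low, high)
                    snapshot = [(k, v) for k, v in self.table.items() if low <= k < high]
                for key, item in snapshot:      # build this portion of the index
                    if indexed_attr in item:
                        self.index.setdefault(item[indexed_attr], set()).add(key)
                with self.cond:
                    self.locked_range = None
                    self.cond.notify_all()      # release writers delayed on this range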
Abstract:
Methods and apparatus for equitable distribution of excess shared-resource throughput capacity are disclosed. A first and a second work target are configured to access a shared resource to implement accepted work requests. Admission control is managed at the work targets using respective token buckets. A first metric indicative of the work request arrival rates at the work targets during a time interval, and a second metric associated with the provisioned capacities of the work targets are determined. A number of tokens determined based on a throughput limit of the shared resource is distributed among the work targets to be used for admission control during a subsequent time interval. The number of tokens distributed to each work target is based on the first metric and/or the second metric.
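A minimal sketch of one plausible distribution rule, assuming the tokens derived from the shared resource's throughput limit are split according to a blend of the two metrics; the alpha weighting and the rounding behavior are assumptions made for illustration.

    def distribute_tokens(total_tokens, arrival_rates, provisioned, alpha=0.5):
        """Split total_tokens among work targets for the next time interval.

        arrival_rates[i] -- first metric: request arrivals at target i in the last interval
        provisioned[i]   -- second metric: provisioned capacity of target i
        alpha            -- assumed blend factor between the two metrics
        """
        arrival_sum = sum(arrival_rates) or 1
        prov_sum = sum(provisioned) or 1
        shares = [
            alpha * (a / arrival_sum) + (1 - alpha) * (p / prov_sum)
            for a, p in zip(arrival_rates, provisioned)
        ]
        # rounding may leave a small remainder; a real allocator would account for it
        return [round(total_tokens * s) for s in shares]

    # Example: two work targets with equal provisioned capacity but unequal arrivals.
    print(distribute_tokens(1000, arrival_rates=[300, 100], provisioned=[100, 100]))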
Abstract:
Distributed database management systems may perform range queries over the leading portion of a primary key. Non-random distribution of data may improve performance related to the processing of range queries, but may tend to cause workload to be concentrated on particular partitions. Groups of partitions may be expanded and collapsed based on detection of disproportionate workload. Disproportionate write workload may be distributed among a group of partitions that can subsequently be queried using a federated approach. Disproportionate read workload may be distributed among a group of read-only replicated partitions.
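The expand-and-federate idea for write workload might be sketched as follows, assuming writes to a hot key are scattered randomly across group members and reads merge results from every member; the PartitionGroup class and its methods are hypothetical.

    import random
    from collections import defaultdict

    class PartitionGroup:
        def __init__(self):
            self.members = [defaultdict(list)]     # starts as a single partition

        def expand(self, fanout):
            """Grow the group when a disproportionate write workload is detected."""
            while len(self.members) < fanout:
                self.members.append(defaultdict(list))

        def put(self, hash_key, range_key, value):
            # spread writes for the hot key across the group members
            random.choice(self.members)[hash_key].append((range_key, value))

        def range_query(self, hash_key, low, high):
            # federated read: query every member, merge, and order by range key
            results = []
            for member in self.members:
                results.extend((rk, v) for rk, v in member[hash_key] if low <= rk <= high)
            return sorted(results)

    group = PartitionGroup()
    group.expand(4)                                # hot key detected
    group.put("user#42", 3, "a")
    group.put("user#42", 1, "b")
    print(group.range_query("user#42", 0, 10))     # [(1, 'b'), (3, 'a')]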
Abstract:
Methods and apparatus for burst-mode admission control using token buckets are disclosed. A work request (such as a read or a write) directed to a work target is received. Based on a first criterion, a determination is made that the work target is in a burst mode of operation. A token population of a burst-mode token bucket is determined, and if the population meets a second criterion, the work request is accepted for execution.
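A hedged sketch of the two-criterion check, assuming the first criterion is exhaustion of a normal-mode bucket and the second criterion is sufficient population in the burst-mode bucket; both assumptions are simplifications for illustration.

    import time

    class TokenBucket:
        def __init__(self, rate, capacity):
            self.rate, self.capacity = rate, capacity
            self.tokens, self.last = float(capacity), time.monotonic()

        def _refill(self):
            now = time.monotonic()
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now

        def try_consume(self, n=1):
            self._refill()
            if self.tokens >= n:
                self.tokens -= n
                return True
            return False

    def admit(normal_bucket, burst_bucket, cost=1):
        """Decide whether to accept a work request (e.g., a read or write) for execution."""
        if normal_bucket.try_consume(cost):
            return True
        # First criterion (assumed): the normal bucket is exhausted, so the work
        # target is treated as being in a burst mode of operation.
        # Second criterion: the burst-mode bucket's token population covers the request.
        return burst_bucket.try_consume(cost)

    normal = TokenBucket(rate=100, capacity=100)
    burst = TokenBucket(rate=10, capacity=1000)
    print(admit(normal, burst))    # True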
Abstract:
Methods and apparatus for token-sharing mechanisms for burst-mode operations are disclosed. A first and a second token bucket are respectively configured for admission control at a first and a second work target. A number of tokens to be transferred between the first bucket and the second bucket, as well as the direction of the transfer, are determined, for example based on messages exchanged between the work targets. The token transfer is initiated, and admission control decisions at the work targets are made based on the token population resulting from the transfer.
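One way the transfer decision could be sketched, assuming each work target reports its current token population and expected demand; the report fields, the max_transfer cap, and the surplus/deficit rule are illustrative assumptions.

    def plan_transfer(report_a, report_b, max_transfer):
        """Decide the direction and number of tokens to move between buckets A and B.

        Each report is a dict like {"tokens": current population, "demand": expected need}.
        Returns (tokens_to_move, direction), where direction is "a_to_b", "b_to_a", or None.
        """
        surplus_a = report_a["tokens"] - report_a["demand"]
        surplus_b = report_b["tokens"] - report_b["demand"]
        if surplus_a > 0 and surplus_b < 0:
            return min(surplus_a, -surplus_b, max_transfer), "a_to_b"
        if surplus_b > 0 and surplus_a < 0:
            return min(surplus_b, -surplus_a, max_transfer), "b_to_a"
        return 0, None

    def apply_transfer(bucket_a, bucket_b, amount, direction):
        # admission control at each target then uses the post-transfer populations
        if direction == "a_to_b":
            bucket_a["tokens"] -= amount; bucket_b["tokens"] += amount
        elif direction == "b_to_a":
            bucket_b["tokens"] -= amount; bucket_a["tokens"] += amount

    a = {"tokens": 120, "demand": 40}
    b = {"tokens": 10, "demand": 60}
    amount, direction = plan_transfer(a, b, max_transfer=100)
    apply_transfer(a, b, amount, direction)
    print(a["tokens"], b["tokens"])    # 70 60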
Abstract:
A system that implements a scalable data storage service may maintain tables in a non-relational data store on behalf of service clients. Each table may include multiple items. Each item may include one or more attributes, each containing a name-value pair. The system may provide an API through which clients can query tables maintained by the service. Items may be partitioned and indexed in a table according to a simple or composite primary key contained in all items in the table. A composite primary key may include a hash key attribute and a range key attribute. The range key attribute may be usable to order items having the same hash key attribute value, and to partition them dependent on a range of range key attribute values. A query request may specify a logical or mathematical expression dependent on range key attribute values and may be directed to multiple partitions.
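The composite-key layout and a range-key query might look roughly like the following sketch, in which items sharing a hash key are split across partitions by range-key boundaries and ordered by range key; the CompositeKeyTable class and its boundaries are assumptions, not the service's API.

    import bisect

    class CompositeKeyTable:
        def __init__(self, range_boundaries):
            # boundaries split the range-key space into partitions, e.g. [100] -> two partitions
            self.boundaries = range_boundaries
            self.partitions = [dict() for _ in range(len(range_boundaries) + 1)]

        def _partition_for(self, range_key):
            return self.partitions[bisect.bisect_right(self.boundaries, range_key)]

        def put(self, hash_key, range_key, attrs):
            self._partition_for(range_key).setdefault(hash_key, []).append((range_key, attrs))

        def query(self, hash_key, low, high):
            """Query with a range-key condition low <= range_key <= high."""
            matches = []
            for part in self.partitions:           # the query may span multiple partitions
                matches.extend((rk, a) for rk, a in part.get(hash_key, []) if low <= rk <= high)
            return sorted(matches)                 # results ordered by range key

    table = CompositeKeyTable(range_boundaries=[100])
    table.put("forum#1", 42, {"subject": "hello"})
    table.put("forum#1", 150, {"subject": "world"})
    print(table.query("forum#1", 0, 200))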
Abstract:
A system that implements a data storage service may store data for database tables in multiple replicated partitions on respective storage nodes. In response to a request to back up a table, the service may export individual partitions of the table from the database and package them to be independently uploaded (e.g., in parallel) to a remote storage system (e.g., a key-value durable storage system). Prior to uploading the exported and packaged partitions to the remote storage system, the service may verify that the exported and packaged partitions can be subsequently restored, which may include unpackaging and/or re-inflating the exported and packaged partitions to create additional unpackaged copies of the partitions, re-importing the additional unpackaged copies of the partitions into the database (e.g., as additional replicas), and/or comparing checksums generated for the exported partitions with checksums generated for the additional unpackaged copies of the partitions.
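The export, package, verify, and upload flow could be sketched as below, with JSON serialization, zlib compression, and a dictionary standing in for the remote key-value store; all of these stand-ins are assumptions made for illustration.

    import hashlib, json, zlib

    def export_partition(items):
        return json.dumps(items, sort_keys=True).encode()

    def package(exported_bytes):
        return zlib.compress(exported_bytes)

    def checksum(data):
        return hashlib.sha256(data).hexdigest()

    def verify_restorable(exported_bytes, packaged_bytes):
        """Unpackage the packaged copy and confirm it matches the exported partition."""
        reinflated = zlib.decompress(packaged_bytes)
        restored = json.loads(reinflated)          # "re-import" into an unpackaged copy
        return checksum(exported_bytes) == checksum(json.dumps(restored, sort_keys=True).encode())

    def backup_table(partitions, remote_store):
        for pid, items in partitions.items():
            exported = export_partition(items)
            packaged = package(exported)
            if not verify_restorable(exported, packaged):
                raise RuntimeError("partition %s failed restore verification" % pid)
            remote_store[pid] = packaged           # independently uploadable (e.g., in parallel)

    remote = {}
    backup_table({"p0": [{"id": 1}], "p1": [{"id": 2}]}, remote)
    print(sorted(remote))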
Abstract:
A system that implements a scalable data storage service may maintain tables in a data store on behalf of storage service clients. The service may maintain data in partitions stored on respective computing nodes in the system. The service may support multiple throughput models, including a committed throughput model and a best effort throughput model. A service request to create a table may specify that requests directed to the table should be serviced under a committed throughput model and may specify the committed throughput level in terms of logical service request units. The service may reserve low-latency storage and other resources sufficient to meet the specified committed throughput level. A client/user may request a modification to the committed throughput level in anticipation of workload changes, such as an increase or decrease in traffic or data volume. In response, the system may increase or decrease the resources reserved for the table.
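Under an assumed sizing rule (a fixed number of request units per partition), the reservation and resize behavior might be sketched as follows; UNITS_PER_PARTITION and the ProvisionedTable class are hypothetical.

    import math

    UNITS_PER_PARTITION = 1000      # assumed capacity of one storage partition

    class ProvisionedTable:
        def __init__(self, name, read_units, write_units):
            self.name = name
            self.read_units = read_units
            self.write_units = write_units
            self.partitions = self._required_partitions()

        def _required_partitions(self):
            # reserve enough low-latency storage to meet the committed throughput level
            return max(1, math.ceil((self.read_units + self.write_units) / UNITS_PER_PARTITION))

        def update_throughput(self, read_units, write_units):
            """Client requests a change in anticipation of workload changes."""
            self.read_units, self.write_units = read_units, write_units
            self.partitions = self._required_partitions()   # grow or shrink the reservation

    t = ProvisionedTable("orders", read_units=1500, write_units=500)
    print(t.partitions)              # 2
    t.update_throughput(read_units=4000, write_units=2000)
    print(t.partitions)              # 6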
Abstract:
One or more table partitions may communicate with an index partition that may be a master of a replication group. A communications channel may exist between table partitions and the index partition. Upon splitting the index partition, communications between the table partitions and the index partition may be suspended. Upon completion of the split, communications may be reestablished between the table partitions and a partition, of the replication group of index partitions, designated to be a master following the split. Messages accumulated by the table partitions during the split may be sent to the index partition upon reestablishing communications.
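The suspend, reestablish, and replay sequence might be sketched like this; the IndexMaster and TablePartition classes and the in-memory message buffer are illustrative assumptions.

    class IndexMaster:
        def __init__(self, name):
            self.name, self.applied = name, []
        def apply(self, msg):
            self.applied.append(msg)

    class TablePartition:
        def __init__(self, index_master):
            self.channel = index_master      # communications channel to the group's master
            self.buffer = []                 # messages accumulated while suspended

        def send(self, msg):
            if self.channel is None:
                self.buffer.append(msg)      # split in progress: accumulate locally
            else:
                self.channel.apply(msg)

        def suspend(self):
            self.channel = None

        def resume(self, new_master):
            # reestablish with the partition designated master after the split,
            # then send everything accumulated during the split
            self.channel = new_master
            for msg in self.buffer:
                new_master.apply(msg)
            self.buffer.clear()

    old = IndexMaster("idx-0")
    tp = TablePartition(old)
    tp.send({"op": "put", "key": 1})
    tp.suspend()                             # split of the index partition begins
    tp.send({"op": "put", "key": 2})
    new = IndexMaster("idx-0a")              # designated master following the split
    tp.resume(new)
    print(old.applied, new.applied)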