Abstract:
Embodiments of the present disclosure are directed to, among other things, validating the integrity of received and/or stored data payloads. In some examples, a storage service may perform a first partitioning of a data object into first partitions based at least in part on a first operation. The storage service may also verify the data object using a verification algorithm to generate a first verification value. In some cases, the storage service may additionally perform a second partitioning of the data object into second partitions based at least in part on a second operation. The second partitions may be different from the first partitions. Additionally, the storage service may verify the data object using the same verification algorithm to generate a second verification value. Further, the storage service may determine whether the second verification value equals the first verification value.
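The comparison described above can be sketched in a few lines. The abstract does not name the verification algorithm, so this sketch assumes a streaming SHA-256 digest; the helper names `partition` and `verification_value` and the two partition sizes are hypothetical, chosen only for illustration.

```python
import hashlib


def partition(data: bytes, size: int) -> list[bytes]:
    """Split data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def verification_value(partitions: list[bytes]) -> str:
    """Feed the partitions, in order, into one streaming SHA-256 digest.

    Assumption: SHA-256 stands in for the unnamed verification algorithm.
    """
    digest = hashlib.sha256()
    for part in partitions:
        digest.update(part)
    return digest.hexdigest()


payload = bytes(range(256)) * 40  # 10,240 bytes of sample data

# First partitioning, e.g. the part size chosen for an upload operation.
first = verification_value(partition(payload, 1024))
# Second partitioning, e.g. the chunk size chosen for a storage operation.
second = verification_value(partition(payload, 4096))

# The two values should be equal if the payload is intact.
values_match = first == second
```

Because the digest depends only on the byte sequence, the value is the same however the object is split, so a mismatch would indicate that the data changed between the two operations.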
Abstract:
In this disclosure, a resource scheduler is described that allows virtual machine instances to earn resource credits during periods of low activity. Virtual machine instances that spend most of their time operating at low activity levels are able to gain resource credits quickly. Once these virtual machine instances acquire enough resource credits to surpass a threshold level, the resource scheduler can assign them a high priority level that provides them with priority access to CPU resources. The next time that the virtual machine instances enter a high activity level, they have a high priority level that allows them to preempt other, lower priority virtual machine instances. Thus, these virtual machine instances are able to process operations and/or respond to user requests with low latency.
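A minimal sketch of the credit mechanism follows. The utilization cutoff, the threshold value, the one-credit-per-interval rate, and the rule that busy intervals spend credits are all assumptions made for illustration; the abstract specifies none of them.

```python
from dataclasses import dataclass

LOW_ACTIVITY_CPU = 0.2  # assumed cutoff: below this, an interval counts as low activity
CREDIT_THRESHOLD = 5    # assumed number of credits that must be surpassed


@dataclass
class Instance:
    name: str
    credits: int = 0

    @property
    def priority(self) -> str:
        # High priority only once the credits surpass the threshold.
        return "high" if self.credits > CREDIT_THRESHOLD else "normal"


def tick(instance: Instance, cpu_utilization: float) -> None:
    """Account for one scheduling interval.

    Low-activity intervals earn a credit. Spending a credit during busy
    intervals is an assumption, not something the abstract states.
    """
    if cpu_utilization < LOW_ACTIVITY_CPU:
        instance.credits += 1
    elif instance.credits > 0:
        instance.credits -= 1


def next_to_run(instances: list[Instance]) -> Instance:
    """Pick the instance to schedule: high priority first, then most credits."""
    return max(instances, key=lambda i: (i.priority == "high", i.credits))


idle_vm = Instance("mostly-idle")
busy_vm = Instance("always-busy")
for _ in range(6):
    tick(idle_vm, 0.05)
    tick(busy_vm, 0.90)

# When both instances want the CPU, the mostly idle one is chosen first.
chosen = next_to_run([busy_vm, idle_vm])
```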
Abstract:
In a system in which documents are generated dynamically in response to user requests, historical data is collected regarding data retrieval subtasks, such as service requests, that are performed to generate such documents. This data is used to predict the specific subtasks that will be performed to respond to specific document requests, such that these subtasks may be initiated preemptively at or near the outset of the associated document generation task. In one embodiment, the historical data is included within, or is used to generate, a mapping table that maps document generation tasks (which may correspond to specific URLs) to the data retrieval subtasks that are frequently performed within such tasks.
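One way to picture the mapping table is as a dictionary built from request logs. The sketch below is an illustration under stated assumptions: the record format, the 50% frequency cutoff, and the names `build_mapping` and `subtasks_to_prefetch` are hypothetical and do not come from the abstract.

```python
from collections import Counter, defaultdict


def build_mapping(history, min_frequency=0.5):
    """Build a table mapping each URL to its frequently performed subtasks.

    history: iterable of (url, subtasks_performed) records.
    min_frequency: assumed cutoff; a subtask is kept if it appeared in at
    least this fraction of past requests for the URL.
    """
    requests = Counter()
    subtask_counts = defaultdict(Counter)
    for url, subtasks in history:
        requests[url] += 1
        subtask_counts[url].update(set(subtasks))
    return {
        url: sorted(
            name
            for name, count in subtask_counts[url].items()
            if count / requests[url] >= min_frequency
        )
        for url in requests
    }


history = [
    ("/product/42", ["catalog", "reviews", "recommendations"]),
    ("/product/42", ["catalog", "reviews"]),
    ("/product/42", ["catalog"]),
    ("/cart", ["cart", "catalog"]),
]
mapping = build_mapping(history)


def subtasks_to_prefetch(url):
    """Subtasks to start as soon as a request for `url` arrives."""
    return mapping.get(url, [])
```

With this sample history, a request for `/product/42` would start the `catalog` and `reviews` subtasks immediately, while `recommendations` (seen in only one of three past requests) would wait until the generation task asks for it.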
Abstract:
A request to retrieve a persistently stored data object is received, the request including a data object identifier that encodes at least storage location information and validation information related to the data object. The data object is retrieved using at least the storage location information to form a retrieved data object, and validation is performed using at least the validation information.
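A self-describing identifier of this kind might look like the sketch below. The encoding (base64 over JSON), the choice of SHA-256 and size as validation information, and the names `STORE`, `put_object`, and `get_object` are assumptions for illustration only; the abstract does not specify any of them.

```python
import base64
import hashlib
import json

STORE = {}  # hypothetical stand-in for persistent storage, keyed by location


def put_object(location: str, data: bytes) -> str:
    """Store data and return an identifier encoding location and validation info."""
    STORE[location] = data
    fields = {
        "loc": location,                           # storage location information
        "sha256": hashlib.sha256(data).hexdigest(),  # validation information
        "size": len(data),
    }
    return base64.urlsafe_b64encode(json.dumps(fields).encode()).decode()


def get_object(object_id: str) -> bytes:
    """Decode the identifier, fetch the object, and validate it before returning."""
    fields = json.loads(base64.urlsafe_b64decode(object_id))
    data = STORE[fields["loc"]]
    digest = hashlib.sha256(data).hexdigest()
    if len(data) != fields["size"] or digest != fields["sha256"]:
        raise ValueError("retrieved object failed validation")
    return data
```

Because the identifier carries everything needed to locate and check the object, the retrieval path in this sketch needs no separate index lookup.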
Abstract:
An application programming interface for a data storage service provides a convenient mechanism for clients of the data storage service to access its various capabilities. An API call may be made to initiate a job, and in response a job identifier may be provided. A separate API call specifying the job identifier may then be made, resulting in a response that provides information related to the job. Various API calls may be used to store data, retrieve data, obtain an inventory of stored data, and obtain other information relating to stored data.
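The two-call job pattern can be sketched with an in-memory stand-in. The class `StorageService`, its method names, the job types, and the status strings are all hypothetical; they illustrate the shape of the interaction and are not the service's actual API.

```python
import uuid


class StorageService:
    """Hypothetical in-memory stand-in for the data storage service."""

    def __init__(self):
        self._objects = {}
        self._jobs = {}

    def store(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def initiate_job(self, job_type: str, key: str = "") -> str:
        """Start a 'retrieve' or 'inventory' job and return its identifier."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = {"type": job_type, "key": key,
                              "status": "InProgress", "output": None}
        return job_id

    def describe_job(self, job_id: str) -> dict:
        """Separate call: report on a job given only its identifier."""
        job = self._jobs[job_id]
        return {"job_id": job_id, "type": job["type"], "status": job["status"]}

    def complete_jobs(self) -> None:
        """Simulate background processing finishing every pending job."""
        for job in self._jobs.values():
            if job["status"] != "InProgress":
                continue
            if job["type"] == "inventory":
                job["output"] = sorted(self._objects)
            else:
                job["output"] = self._objects.get(job["key"])
            job["status"] = "Succeeded"

    def get_job_output(self, job_id: str):
        job = self._jobs[job_id]
        if job["status"] != "Succeeded":
            raise RuntimeError("job has not completed")
        return job["output"]


service = StorageService()
service.store("backup-2012", b"payload")
inventory_job = service.initiate_job("inventory")
retrieve_job = service.initiate_job("retrieve", "backup-2012")
status_before = service.describe_job(retrieve_job)["status"]
service.complete_jobs()
status_after = service.describe_job(retrieve_job)["status"]
```

Returning an identifier immediately lets the client poll for completion rather than hold a connection open while a slow retrieval runs.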
Abstract:
Methods and systems are described herein to provide efficient data retrieval in a data storage system. Specifically, in cases where users of a data storage system are not overly sensitive to data retrieval time, as is the case for backup and archival data storage systems, random read requests may be fulfilled as part of sequential reads to reduce I/O operations. A data storage system may be divided into data storage zones. Sequential reads may be performed for data stored in those data storage zones that have pending data retrieval requests. Data retrieval requests may be fulfilled based at least in part on the sequentially read data.
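The batching idea can be sketched as follows. The zone layout, the request format, and the function name `fulfil_requests` are assumptions for illustration; a real system would stream from disk rather than iterate over in-memory lists.

```python
from collections import defaultdict


def fulfil_requests(zones, pending):
    """Serve pending random reads with one sequential pass per zone.

    zones: {zone_id: [(object_id, data), ...]} in on-disk order (assumed layout).
    pending: iterable of (zone_id, object_id) retrieval requests.
    Returns the retrieved objects and the list of zones that were scanned.
    """
    wanted = defaultdict(set)
    for zone_id, object_id in pending:
        wanted[zone_id].add(object_id)

    results, zones_scanned = {}, []
    for zone_id in sorted(wanted):  # zones without pending requests are skipped
        zones_scanned.append(zone_id)
        for object_id, data in zones[zone_id]:  # single sequential read of the zone
            if object_id in wanted[zone_id]:
                results[object_id] = data
    return results, zones_scanned


zones = {
    "zone-0": [("a", b"A"), ("b", b"B"), ("c", b"C")],
    "zone-1": [("d", b"D"), ("e", b"E")],
    "zone-2": [("f", b"F")],
}
pending = [("zone-0", "c"), ("zone-1", "d"), ("zone-0", "a")]
results, zones_scanned = fulfil_requests(zones, pending)
```

In this example the two requests against `zone-0` share one pass, and `zone-2` is never read because nothing is pending there.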
Abstract:
Techniques for optimizing data storage are disclosed herein. In particular, methods and systems for implementing redundancy encoding schemes with data storage systems are described. The redundancy encoding schemes may be scheduled according to system and data characteristics. The schemes may span multiple tiers or layers of a storage system. The schemes may be generated, for example, in accordance with a transaction rate requirement, a data durability requirement or in the context of the age of the stored data. The schemes may be designed to rectify entropy-related effects upon data storage. The schemes may include one or more erasure codes or erasure coding schemes. Additionally, methods and systems for improving and/or accounting for failure correlation of various components of the storage system, including that of storage devices such as hard disk drives, are described.
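To make the erasure coding idea concrete, the sketch below uses the simplest such code: a single XOR parity shard that can rebuild any one lost data shard. This is a stand-in chosen for brevity, not the scheme the abstract describes; the function names are hypothetical, and the sketch assumes equal-length shards.

```python
from typing import Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(shards: list[bytes]) -> bytes:
    """Compute one parity shard as the XOR of all data shards."""
    parity = bytes(len(shards[0]))  # all-zero bytes of the shard length
    for shard in shards:
        parity = xor_bytes(parity, shard)
    return parity


def recover(shards: list[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing shard (marked None) from the survivors."""
    missing = parity
    for shard in shards:
        if shard is not None:
            missing = xor_bytes(missing, shard)
    return missing


shards = [b"abcd", b"efgh", b"ijkl"]
parity = encode(shards)

# Simulate losing the second shard, e.g. to a failed drive.
damaged = [shards[0], None, shards[2]]
rebuilt = recover(damaged, parity)
```

Codes that tolerate more simultaneous losses, such as Reed-Solomon, trade extra storage and computation for higher durability, which is the kind of trade-off a scheme selected by transaction rate or data age would weigh.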