Abstract:
Methods and apparatus for equitable distribution of excess shared-resource throughput capacity are disclosed. A first and a second work target are configured to access a shared resource to implement accepted work requests. Admission control is managed at the work targets using respective token buckets. A first metric indicative of the work request arrival rates at the work targets during a time interval, and a second metric associated with the provisioned capacities of the work targets are determined. A number of tokens determined based on a throughput limit of the shared resource is distributed among the work targets to be used for admission control during a subsequent time interval. The number of tokens distributed to each work target is based on the first metric and/or the second metric.
Abstract:
Methods and apparatus for token-based pricing policies for burst-mode operations are disclosed. A pricing policy to be applied to token population changes at a token bucket used for admission control during burst-mode operations at a work target is determined. Over a time period, changes to the token population of that bucket are recorded. A billing amount to be charged to a client is determined, based on the recorded changes in token population and an associated pricing amount indicated in the pricing policy.
Abstract:
Methods and apparatus for compound token buckets usable for burst-mode admission control are disclosed. A peak burst rate and a sustained burst rate of work requests that are to be supported at a work target are determined. The maximum token populations of a peak-burst token bucket and a sustained-burst token bucket are configured, based on the peak burst rate and the sustained burst rate respectively. In response to receiving a work request directed at the work target, a determination to accept the work request for execution is made based at least in part on the token population of the peak-burst token bucket and/or the sustained-burst token bucket.
Abstract:
Methods and apparatus for token-based admission control for replicated writes are disclosed. Data objects are divided into partitions, and corresponding to each partition, at least a master replica and a slave replica are stored. A determination as to whether to accept a write request directed to the partition is made based at least in part on one or more of (a) available throughput capacity at the master replica, and (b) an indication, obtained using a token-based protocol, of available throughput capacity at the slave replica. If the write request is accepted, one or more data modification operations are initiated.
Abstract:
A system that provides services to clients may receive and service requests, various ones of which may require different amounts of work. An admission control mechanism may manage requests based on tokens, each of which represents a fixed amount of work. The tokens may be added to a token bucket at rate that is dependent on a target work throughput rate while the number of tokens in the bucket does not exceed its maximum capacity. If at least a pre-determined minimum number of tokens is present in the bucket when a service request is received, it may be serviced. Servicing a request may include deducting an initial number of tokens from the bucket, determining that the amount of work performed in servicing the request is different than that represented by the initially deducted tokens, and deducting additional tokens from or replacing tokens in the bucket to reflect the difference.
Abstract:
A system that provides services to clients may receive and service requests, various ones of which may require different amounts of work. The system may determine whether it is operating in an overloaded or underloaded state based on a current work throughput rate, a target work throughput rate, a maximum request rate, or an actual request rate, and may dynamically adjust the maximum request rate in response. For example, if the maximum request rate is being exceeded, the maximum request rate may be raised or lowered, dependent on the current work throughput rate. If the target or committed work throughput rate is being exceeded, but the maximum request rate is not being exceeded, a lower maximum request rate may be proposed. Adjustments to the maximum request rate may be made using multiple incremental adjustments. Service request tokens may be added to a leaky token bucket at the maximum request rate.
Abstract:
Methods and apparatus for token-sharing mechanisms for burst-mode operations are disclosed. A first and a second token bucket are respectively configured for admission control at a first and a second work target. A number of tokens to be transferred between the first bucket and the second bucket, as well as the direction of the transfer, are determined, for example based on messages exchanged between the work targets. The token transfer is initiated, and admission control decisions at the work targets are made based on the token population resulting from the transfer.
Abstract:
Methods and apparatus for burst-mode admission control using token buckets are disclosed. A work request (such as a read or a write) directed to a work target is received. Based on a first criterion, a determination is made that the work target is in a burst mode of operation. A token population of a burst-mode token bucket is determined, and if the population meets a second criterion, the work request is accepted for execution.