Abstract:
A system that provides services to clients may receive and service requests, various ones of which may require different amounts of work. The system may determine whether it is operating in an overloaded or underloaded state based on a current work throughput rate, a target work throughput rate, a maximum request rate, or an actual request rate, and may dynamically adjust the maximum request rate in response. For example, if the maximum request rate is being exceeded, the maximum request rate may be raised or lowered, dependent on the current work throughput rate. If the target or committed work throughput rate is being exceeded, but the maximum request rate is not being exceeded, a lower maximum request rate may be proposed. Adjustments to the maximum request rate may be made using multiple incremental adjustments. Service request tokens may be added to a leaky token bucket at the maximum request rate.
Abstract:
Methods and apparatus for token-sharing mechanisms for burst-mode operations are disclosed. A first and a second token bucket are respectively configured for admission control at a first and a second work target. A number of tokens to be transferred between the first bucket and the second bucket, as well as the direction of the transfer, are determined, for example based on messages exchanged between the work targets. The token transfer is initiated, and admission control decisions at the work targets are made based on the token population resulting from the transfer.
Abstract:
Methods and apparatus for burst-mode admission control using token buckets are disclosed. A work request (such as a read or a write) directed to a work target is received. Based on a first criterion, a determination is made that the work target is in a burst mode of operation. A token population of a burst-mode token bucket is determined, and if the population meets a second criterion, the work request is accepted for execution.