Abstract:
A set of techniques is described for monitoring and analyzing crashes and other malfunctions in a multi-tenant computing environment (e.g., a cloud computing environment). The computing environment may host many applications that are executed on different computing resource combinations. The combinations may include varying types and versions of hardware or software resources. A monitoring service is deployed to gather statistical data about the failures occurring in the computing environment. The statistical data is then analyzed to identify patterns of abnormally high failure rates. The failure patterns may be associated with particular computing resource combinations being used to execute particular types of applications. Based on these failure patterns, suggestions can be issued to a user to execute the application using a different computing resource combination. Alternatively, the failure patterns may be used to modify or update the various resources in order to correct the potential malfunctions caused by those resources.
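The analysis step can be illustrated with a minimal sketch. It assumes failures are reported as simple (resource combination, application type, failed) records, and it treats a combination as abnormal when its failure rate exceeds a multiple of the fleet-wide rate. The function name, the record layout, and the `threshold_factor` and `min_runs` parameters are illustrative assumptions, not details taken from the abstract.

```python
from collections import defaultdict


def find_abnormal_combinations(events, threshold_factor=2.0, min_runs=10):
    """Flag (combination, app type) pairs with unusually high failure rates.

    events: iterable of (combo, app_type, failed) tuples (hypothetical format).
    A pair is flagged when it has at least `min_runs` executions and its
    failure rate exceeds `threshold_factor` times the overall failure rate.
    """
    runs = defaultdict(int)
    failures = defaultdict(int)
    for combo, app_type, failed in events:
        key = (combo, app_type)
        runs[key] += 1
        if failed:
            failures[key] += 1

    total_runs = sum(runs.values())
    if total_runs == 0:
        return []

    # Fleet-wide failure rate serves as the baseline for comparison.
    baseline = sum(failures.values()) / total_runs

    flagged = []
    for key, count in runs.items():
        rate = failures[key] / count
        if count >= min_runs and rate > threshold_factor * baseline:
            flagged.append((key, rate))

    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

A flagged pair could then drive either outcome mentioned above: a suggestion to run the application on a different combination, or a review of the resources in that combination.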
Abstract:
Various features are described for generating and analyzing data center topology graphs. The graphs can represent physical placement and connectivity of data center components. In some cases the graphs may include hierarchical representations of data center components and systems, and may also include environmental and operational characteristics of the computing devices and supporting systems which may be included in a data center. In addition, the graphs may be linked to each other through common components, so that data center topology may be analyzed in two or more dimensions rather than a single dimension. The linked graphs may be analyzed to identify potential points of failure and also to identify which data center components may be affected by a failure.
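One way to picture the linked-graph analysis is a traversal that follows dependency edges across several graphs at once, crossing from one graph to another wherever they share a component. The sketch below assumes each graph is a plain dictionary mapping a component to the components that depend on it; the graph names, component names, and function name are hypothetical.

```python
from collections import deque


def affected_components(graphs, failed):
    """Return every component reachable from `failed` in any linked graph.

    graphs: dict mapping a graph name (e.g. "power", "network") to a dict
            of component -> list of dependent components (assumed format).
    Graphs are linked implicitly: a component that appears in more than
    one graph lets the traversal continue in the other graph.
    """
    seen = {failed}
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for edges in graphs.values():
            for dependent in edges.get(node, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
    seen.discard(failed)
    return seen
```

For example, if a UPS feeds a switch in the power graph and that switch serves a rack in the network graph, the rack is reported as affected by a UPS failure even though no single graph connects the two.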
Abstract:
Methods and apparatus for client-allocatable bandwidth pools are disclosed. A system includes a plurality of resources of a provider network and a resource manager. In response to a determination to accept a bandwidth pool creation request from a client for a resource group, where the resource group comprises a plurality of resources allocated to the client, the resource manager stores an indication of a total network traffic rate limit of the resource group. In response to a bandwidth allocation request from the client to allocate a specified portion of the total network traffic rate limit to a particular resource of the resource group, the resource manager initiates one or more configuration changes to allow network transmissions within one or more network links of the provider network accessible from the particular resource at a rate up to the specified portion.
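The bookkeeping behind such a pool can be sketched as follows. This is a simplified model, not the disclosed resource manager: the class and method names are hypothetical, rates are expressed in Mbps for illustration, and the rule that per-resource allocations may not sum to more than the group total is an assumption. The configuration changes to network links are outside the scope of the sketch.

```python
class BandwidthPool:
    """Tracks a total traffic rate limit for a resource group (illustrative)."""

    def __init__(self, total_mbps, resources):
        self.total_mbps = total_mbps
        # Every resource in the group starts with no dedicated share.
        self.allocations = {resource: 0 for resource in resources}

    def allocate(self, resource, mbps):
        """Set the share of the pool assigned to one resource."""
        if resource not in self.allocations:
            raise KeyError(f"{resource!r} is not in this resource group")
        if mbps < 0:
            raise ValueError("allocation must be non-negative")
        # Assumption: shares across the group cannot exceed the pool total.
        others = sum(self.allocations.values()) - self.allocations[resource]
        if others + mbps > self.total_mbps:
            raise ValueError("allocation exceeds the pool's total rate limit")
        self.allocations[resource] = mbps

    def unallocated(self):
        """Return the portion of the total limit not yet assigned."""
        return self.total_mbps - sum(self.allocations.values())
```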
Abstract:
A system and method are described for preventing dependency problems, such as deadlocks, within network-based computing service workflows, such as workflows that occur within computing assets that provide network-based computing services. The system and method create a remedial workflow or action for the computing services to address deadlocks or other blocking conditions that may occur within the services if the underlying computing assets need to be restarted or rebooted, or if they execute sequentially and reach a problematic operational state. The system and method determine the reliance of each computing service upon the functionality of one or more other network-based computing services and structure the remedial workflow accordingly. Other aspects of the disclosure are described in the detailed description, figures, and claims.
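Structuring a workflow around service dependencies can be illustrated with a topological sort, which is one common way to derive a safe start order and to detect circular reliance; the abstract does not state that this particular algorithm is used. The sketch relies on the standard-library `graphlib` module (Python 3.9 or later). The function name and the service names in the usage note are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def remedial_start_order(depends_on):
    """Return an order in which services can be started safely.

    depends_on: dict mapping each service to the set of services it
                relies on (assumed input format).
    Raises ValueError if the services depend on each other in a cycle,
    which is the kind of condition that could deadlock a restart.
    """
    try:
        # static_order() yields each service after all of its dependencies.
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        # err.args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"circular dependency: {err.args[1]}") from err
```

With `{"web": {"auth", "db"}, "auth": {"db"}, "db": set()}` the database is started first, then the authentication service, then the web tier. A detected cycle would be a signal that a separate remedial action is needed to break the mutual reliance.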
Abstract:
A service provider can maintain one or more host computing devices which may be utilized as bare metal instances by one or more customers of the service provider. Illustratively, each host computing device includes hardware components that are configured in a manner to allow the service provider to implement one or more processes upon a power cycle of the host computing device and prior to access of the host computing device resources by customers. In one aspect, a hosting platform includes components arranged in a manner to limit modifications to software or firmware on hardware components. In another aspect, the hosting platform can implement management functions for establishing control plane functions between the host computing device and the service provider that are independent of the customer. Additionally, the management functions can also be utilized to present different hardware or software attributes of the host computing device.
Abstract:
Systems and methods are disclosed that facilitate the updating of target computing devices based on versioning information. The updates to the target computing devices can utilize a series of external client workflow integration points, or integration points. The integration points allow the client computing device to interact with the computing device management component and dictate the workflow process associated with the implementation of the update procedure on the target computing device. The integration points can also be used by the client to perform additional processes specific to the client's policy-based protocols.
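An integration point can be thought of as a client-supplied callback that runs before a step of the update workflow and decides whether the workflow proceeds. The sketch below is a simplified model under that assumption; the function name, the step names, and the convention that a callback returns `False` to halt the workflow are all hypothetical.

```python
def run_update(steps, integration_points=None):
    """Run update steps in order, consulting client hooks before each one.

    steps: list of (name, action) pairs, where action is a no-argument
           callable (assumed format).
    integration_points: optional dict mapping a step name to a client
           callback; the callback receives the step name and returns
           True to continue or False to stop the workflow.
    Returns the names of the steps that were actually executed.
    """
    integration_points = integration_points or {}
    completed = []
    for name, action in steps:
        hook = integration_points.get(name)
        if hook is not None and not hook(name):
            # The client's policy vetoed this step; stop here.
            break
        action()
        completed.append(name)
    return completed
```

A client could, for example, register a callback on a reboot step that returns `False` outside a maintenance window, so the update is downloaded and installed but the reboot is deferred.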
Abstract:
Systems and methods are disclosed that facilitate the updating of target host computing devices based on versioning information. A set of host computing devices is provisioned with a local computing device management component. Each local computing device management component periodically transmits a request to a host computing device management component to determine whether version information associated with the respective host computing device corresponds to version filter information. Based on processing the version filter information against the current version information of the host computing device, the host computing device management component can facilitate the implementation of updates to the requesting host computing device.
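The comparison performed by the host computing device management component can be sketched as below. The abstract does not define the structure of the version filter information, so the dotted numeric version strings and the `below` and `target` keys are assumptions made purely for illustration, as are the function names.

```python
def parse_version(text):
    """Convert a dotted version string such as '1.4.2' to a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_for_update(current_version, version_filter):
    """Decide whether a polling host should be updated.

    version_filter: dict with hypothetical keys
        'below'  - hosts running a version lower than this match the filter
        'target' - the version a matching host should be updated to
    Returns the target version string, or None if no update applies.
    """
    if parse_version(current_version) < parse_version(version_filter["below"]):
        return version_filter["target"]
    return None
```

Comparing parsed tuples rather than raw strings keeps `1.10.0` ordered after `1.5.0`, which a plain string comparison would get wrong.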
Abstract:
The transmission of data on computer networks according to one or more policies is disclosed. A policy may specify, among other things, various parameters which are to be followed when transmitting or initiating network traffic. Multiple network interfaces may be installed on a server to enable transmission of data from the single server according to a number of discrete configuration settings implicated by the various policies. The multiple network interfaces may correspond to separate physical components, with each component configured independently to implement a feature of a policy. The multiple network interfaces may also correspond to a single physical component that exposes multiple network interfaces, both to the network and to the server on which it is installed.
Abstract:
A set of techniques is described for enabling a user of a virtual resource to specify to the hosting system a preferred performance parameter such as throughput, latency, CPU utilization, or the like. The hosting system then dynamically tunes the underlying resources to favor the preferred performance parameter. Tuning the settings may include adjusting various batching and moderating processes that are available on the hosting device, such as enabling/disabling interrupt coalescing, enabling/disabling segmentation offload, increasing or decreasing the size of a ring buffer used to share data between several resources, batching input/output (I/O) operations and the like. For example, if the user has indicated that lower latency is preferable, the hosting system may disable interrupt coalescing; whereas if the user has indicated that higher throughput should be favored, the hosting system may enable interrupt coalescing.
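The mapping from a stated preference to device settings can be sketched as a small lookup. Only the interrupt coalescing behavior (disabled for latency, enabled for throughput) comes from the abstract; the segmentation offload and ring buffer values are assumptions chosen for illustration, and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningProfile:
    interrupt_coalescing: bool
    segmentation_offload: bool   # direction per preference is an assumption
    ring_buffer_entries: int     # sizes are illustrative assumptions


PROFILES = {
    # Lower latency: deliver interrupts immediately instead of batching them.
    "latency": TuningProfile(
        interrupt_coalescing=False,
        segmentation_offload=False,
        ring_buffer_entries=256,
    ),
    # Higher throughput: batch work to reduce per-packet overhead.
    "throughput": TuningProfile(
        interrupt_coalescing=True,
        segmentation_offload=True,
        ring_buffer_entries=4096,
    ),
}


def select_profile(preference):
    """Return the tuning profile for a user-specified preference."""
    try:
        return PROFILES[preference]
    except KeyError:
        raise ValueError(f"unsupported preference: {preference!r}") from None
```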