Abstract:
Approaches are disclosed for enabling owners of virtual computing resources to specify one or more constraints for their virtual machines and/or virtual networks, with respect to metrics such as cost, latency, throughput, network bandwidth, power usage, server availability, data redundancy, correlated failure susceptibility, and other such metrics. A customer can declare a set of constraints with metric goals for their virtual machine instance or network of instances, and the service provider can optimize the placement (e.g., host selection) and various settings (e.g., hardware and software settings) to satisfy the specified constraints. Satisfying the customer-specified constraints may require taking into account the activity of other virtual machine instances in the shared resource environment.
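As a rough illustration of constraint-driven host selection, the sketch below filters candidate hosts by customer-declared limits and picks the cheapest feasible one. The `Host` and `Constraints` types, their fields, and the "cheapest feasible host" rule are all assumptions made for this example; the abstract does not specify how the provider optimizes placement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    # Hypothetical per-host metrics; the field names are illustrative only.
    name: str
    cost_per_hour: float
    latency_ms: float
    free_slots: int


@dataclass
class Constraints:
    # Hypothetical customer-declared limits.
    max_cost_per_hour: float
    max_latency_ms: float


def choose_host(hosts: List[Host], constraints: Constraints) -> Optional[Host]:
    """Return the cheapest host that satisfies every constraint, or None."""
    feasible = [
        h for h in hosts
        if h.free_slots > 0
        and h.cost_per_hour <= constraints.max_cost_per_hour
        and h.latency_ms <= constraints.max_latency_ms
    ]
    return min(feasible, key=lambda h: h.cost_per_hour, default=None)


hosts = [
    Host("host-a", 0.10, 40.0, 2),
    Host("host-b", 0.08, 90.0, 1),  # cheaper, but too slow
    Host("host-c", 0.12, 15.0, 0),  # fast, but full and too expensive
]
selected = choose_host(hosts, Constraints(max_cost_per_hour=0.11, max_latency_ms=50.0))
print(selected.name if selected else "no feasible host")
```

A real placement service would presumably also weigh the load imposed by other instances on each host, which this sketch reduces to a single `free_slots` count.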
Abstract:
A system includes a rack, a plurality of shelves, a plurality of shelf-mountable electrical systems, and an inter-shelf power-pooling bus. The inter-shelf power-pooling bus is coupled to a power output of a shelf power supply mechanism on each of the shelves and a power input of a shelf computing device on each of the shelves. The inter-shelf power-pooling bus supplies pooled power from the shelf power supply mechanisms coupled to the inter-shelf power-pooling bus to the shelf computing devices coupled to the inter-shelf power-pooling bus.
Abstract:
An asset health monitoring system (AHMS) can assign a confidence indicator to some or all of the monitored computing assets in a data center, such as computing systems or networking devices. In response to drops in the confidence indicators, the AHMS can automatically initiate testing of computing assets in order to raise confidence that the assets will perform correctly. Further, the AHMS can automatically initiate remediation procedures for computing assets that fail the confidence testing. By automatically triggering testing of assets and/or remediation procedures, the AHMS can increase reliability for the data center by preemptively identifying problems.
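The test-then-remediate flow described above might be sketched as follows. The numeric confidence scale, the `TEST_THRESHOLD` value, and the `decay` and `evaluate` helpers are hypothetical; the abstract does not say how confidence is computed or what triggers a drop.

```python
from dataclasses import dataclass
from typing import Callable

TEST_THRESHOLD = 0.5  # hypothetical: confidence below this triggers testing


@dataclass
class Asset:
    name: str
    confidence: float = 1.0  # assumed scale: 0.0 (no confidence) to 1.0 (full)


def decay(asset: Asset, amount: float) -> None:
    """Lower an asset's confidence, e.g. after an error report or elapsed time."""
    asset.confidence = max(0.0, asset.confidence - amount)


def evaluate(asset: Asset, run_test: Callable[[Asset], bool]) -> str:
    """Test a low-confidence asset; restore confidence on pass, remediate on fail."""
    if asset.confidence >= TEST_THRESHOLD:
        return "ok"
    if run_test(asset):
        asset.confidence = 1.0
        return "tested"
    return "remediate"


switch = Asset("switch-1")
decay(switch, 0.6)
action = evaluate(switch, run_test=lambda a: True)  # stand-in for a real test suite
print(action, switch.confidence)
```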
Abstract:
Disclosed are various embodiments of a computing device for validating the configuration of components of a component assembly. The computing device serves a boot image executable by a component of the component assembly. Expected configuration data associated with the component is identified by the computing device, and actual configuration data associated with the component is obtained by the computing device. The computing device determines a validation response for the component assembly based at least in part upon a comparison of the expected configuration data and the actual configuration data.
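The comparison step can be illustrated with a minimal sketch that checks actual configuration values against expected ones and reports the mismatches. The dictionary representation, the key names, and the shape of the returned response are assumptions for this example only.

```python
from typing import Any, Dict


def validate(expected: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Any]:
    """Compare expected and actual configuration; list every mismatched key."""
    mismatches = {
        key: (want, actual.get(key))  # (expected value, observed value or None)
        for key, want in expected.items()
        if actual.get(key) != want
    }
    return {"valid": not mismatches, "mismatches": mismatches}


# Hypothetical configuration data for one component.
expected = {"cpu_count": 2, "memory_gb": 64, "bios_version": "1.4"}
actual = {"cpu_count": 2, "memory_gb": 32, "bios_version": "1.4"}

result = validate(expected, actual)
print(result)
```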
Abstract:
Generally described, systems and methods are provided for monitoring and detecting causes of failures of network paths. The system collects performance information from a plurality of nodes and links in a network, aggregates the collected performance information across paths in the network, processes the aggregated performance information for detecting failures on the paths, analyzes each of the detected failures to determine at least one root cause, and initiates a remedial workflow for the at least one root cause determined. In some aspects, processing the aggregated information may include performing a statistical regression analysis or otherwise solving a set of equations for the performance indications on each of a plurality of paths. In another aspect, the system may also include an interface which makes available for display one or more of the network topology, the collected and aggregated performance information, and indications of the detected failures in the topology.
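One way to read "solving a set of equations for the performance indications" is as a least-squares fit that attributes path-level loss to individual links. The sketch below assumes small, additive per-link loss rates and a known routing matrix; the `routing` data, the link names, and both helper functions are illustrative and not taken from the abstract.

```python
from typing import List


def solve_linear(m: List[List[float]], b: List[float]) -> List[float]:
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def estimate_link_loss(routing: List[List[int]], path_loss: List[float]) -> List[float]:
    """Least-squares link loss via the normal equations (A^T A) x = A^T y."""
    n_links = len(routing[0])
    ata = [[sum(row[i] * row[j] for row in routing) for j in range(n_links)]
           for i in range(n_links)]
    aty = [sum(row[i] * y for row, y in zip(routing, path_loss))
           for i in range(n_links)]
    return solve_linear(ata, aty)


links = ["L0", "L1", "L2"]
# routing[p][l] is 1 if path p traverses link l (hypothetical topology).
routing = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
path_loss = [0.05, 0.0, 0.05, 0.05]  # aggregated loss measured per path

estimates = estimate_link_loss(routing, path_loss)
suspect = links[max(range(len(links)), key=lambda i: estimates[i])]
print(suspect)
```

The link with the largest estimated loss is treated here as the candidate root cause; a production system would need to handle noisy measurements and routing matrices that do not have full column rank.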
Abstract:
Operating profiles for consumers of computing resources may be automatically determined based on an analysis of actual resource usage measurements and other operating metrics. Measurements may be taken while a consumer, such as a virtual machine instance, uses computing resources, such as those provided by a host. A profile may be dynamically determined based on those measurements. Profiles may be generalized such that groups of consumers with similar usage profiles are associated with a single profile. Assignment decisions may be made based on the profiles, and computing resources may be reallocated or oversubscribed if the profiles indicate that the consumers are unlikely to fully utilize the resources reserved for them. Oversubscribed resources may be monitored, and consumers may be transferred to different resource providers if contention for resources is too high.
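A minimal sketch of profile-based oversubscription follows, assuming CPU usage samples per consumer and a simple rule that compares summed mean usage against host capacity minus a safety margin. The profile fields, the `headroom` parameter, and the sample data are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def build_profile(samples: List[float]) -> Dict[str, float]:
    """Summarize measured usage; real profiles could cover many more metrics."""
    return {"mean": mean(samples), "peak": max(samples)}


def can_oversubscribe(profiles: List[Dict[str, float]], capacity: float,
                      headroom: float = 0.2) -> bool:
    """True if expected total usage fits within capacity less a safety margin."""
    expected = sum(p["mean"] for p in profiles)
    return expected <= capacity * (1 - headroom)


# Three instances that each reserve 4 CPUs on a host with 8 CPUs (12 reserved).
usage = {
    "vm-1": [1.0, 1.5, 2.0],
    "vm-2": [0.5, 1.0, 0.5],
    "vm-3": [2.0, 2.5, 3.0],
}
profiles = [build_profile(samples) for samples in usage.values()]
fits = can_oversubscribe(profiles, capacity=8.0)
print(fits)
```

Using only the mean ignores coincident peaks, which is why the abstract's monitoring and transfer step matters once resources are oversubscribed.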
Abstract:
Systems and methods are described for managing computing resources. In one embodiment, groupings of computer resources having common firmware settings are maintained based on an abstraction firmware framework representing associations between vendor-specific firmware settings and abstracted firmware settings that provide a degree of independence from specific vendor-specific firmware settings. In response to a request for a computer resource with a specified abstracted firmware configuration, it is determined which of the groupings can support the specified abstracted firmware configuration based on at least one criterion for managing the computer resources in accordance with the abstraction firmware framework.
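The idea of grouping resources by abstracted firmware settings could look roughly like the sketch below. The `VENDOR_MAP` table, the vendor names, and the vendor-specific setting names are invented for illustration, and subset matching is just one possible criterion for deciding which groupings support a request.

```python
from typing import Any, Dict, FrozenSet, List, Tuple

# Hypothetical mapping: abstracted setting name -> vendor-specific setting name.
VENDOR_MAP = {
    "vendor_x": {"hyperthreading": "HT_Enable", "virtualization": "VT_x"},
    "vendor_y": {"hyperthreading": "SMT_Mode", "virtualization": "SVM"},
}

ConfigKey = FrozenSet[Tuple[str, Any]]


def to_abstract(vendor: str, vendor_settings: Dict[str, Any]) -> Dict[str, Any]:
    """Translate vendor-specific settings into abstracted settings."""
    reverse = {v: k for k, v in VENDOR_MAP[vendor].items()}
    return {reverse[n]: val for n, val in vendor_settings.items() if n in reverse}


def group_by_abstract_config(resources: List[Dict[str, Any]]) -> Dict[ConfigKey, List[str]]:
    """Group resource ids that share the same abstracted configuration."""
    groups: Dict[ConfigKey, List[str]] = {}
    for res in resources:
        key = frozenset(to_abstract(res["vendor"], res["settings"]).items())
        groups.setdefault(key, []).append(res["id"])
    return groups


def find_resources(groups: Dict[ConfigKey, List[str]], requested: Dict[str, Any]) -> List[str]:
    """Return ids from every grouping whose configuration includes the request."""
    wanted = set(requested.items())
    return sorted(rid for key, ids in groups.items() if wanted <= key for rid in ids)


resources = [
    {"id": "r1", "vendor": "vendor_x", "settings": {"HT_Enable": True, "VT_x": True}},
    {"id": "r2", "vendor": "vendor_y", "settings": {"SMT_Mode": True, "SVM": True}},
    {"id": "r3", "vendor": "vendor_y", "settings": {"SMT_Mode": False, "SVM": True}},
]
groups = group_by_abstract_config(resources)
matches = find_resources(groups, {"hyperthreading": True})
print(matches)
```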
Abstract:
A trusted computing host is described that provides various security computations and other functions in a distributed multitenant and/or virtualized computing environment. The trusted computing host can communicate with one or more host computing devices that host virtual machines to provide a number of security-related functions, including but not limited to boot firmware measurement, cryptographic key management, remote attestation, and security and forensics management. The trusted computing host maintains an isolated partition for each host computing device in the environment and communicates with peripheral cards on the host computing devices in order to provide one or more security functions.
Abstract:
A computing system includes a chassis and one or more backplanes coupled to the chassis. Computing devices are coupled to the one or more backplanes. The one or more backplanes include backplane openings that allow air to pass from one side of the backplane to the other side of the backplane. Air channels are formed by adjacent circuit board assemblies of the computing devices and the one or more backplanes. Channel capping elements at least partially close the air channels.