Abstract:
Techniques for scheduling a plurality of jobs sharing input are provided. The techniques include partitioning one or more input datasets into multiple subcomponents, analyzing a plurality of jobs to determine which of the plurality of jobs require scanning of one or more common subcomponents of the one or more input datasets, and scheduling the jobs that require scanning of one or more common subcomponents of the one or more input datasets, thereby facilitating a single scan of the one or more common subcomponents to be used as input by each of those jobs.
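As a rough illustration of the shared-scan idea, the sketch below groups jobs by the subcomponents they read and scans each subcomponent once. The names `plan_shared_scans`, `run_shared_scans`, `fake_scan`, and the job and subcomponent labels are hypothetical and do not come from the abstract. The sketch assumes the partitioning step has already produced named subcomponents.

```python
from collections import defaultdict


def plan_shared_scans(job_inputs):
    """Map each subcomponent to the jobs that need to scan it."""
    readers = defaultdict(set)
    for job, parts in job_inputs.items():
        for part in parts:
            readers[part].add(job)
    return {part: sorted(readers[part]) for part in sorted(readers)}


def run_shared_scans(job_inputs, scan):
    """Scan every required subcomponent once and share the result."""
    delivered = {job: {} for job in job_inputs}
    for part, jobs in plan_shared_scans(job_inputs).items():
        data = scan(part)  # one scan, however many jobs read this part
        for job in jobs:
            delivered[job][part] = data
    return delivered


# Hypothetical example: two jobs overlap on subcomponent "p2".
job_inputs = {"job_a": ["p1", "p2"], "job_b": ["p2", "p3"]}
scanned = []


def fake_scan(part):
    """Stand-in for a real scan; records which parts were read."""
    scanned.append(part)
    return part.upper()


result = run_shared_scans(job_inputs, fake_scan)
```

Here "p2" is read once even though both jobs consume it. A real scheduler would also have to decide when to run each group, which this sketch leaves out.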
Abstract:
Techniques for scheduling one or more MapReduce jobs in the presence of one or more priority classes are provided. The techniques include obtaining a preferred ordering for one or more MapReduce jobs, wherein the preferred ordering comprises one or more priority classes, prioritizing the one or more priority classes subject to one or more dynamic minimum slot guarantees for each priority class, and iteratively employing a MapReduce scheduler, once per priority class, in priority class order, to optimize performance of the one or more MapReduce jobs.
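One possible reading of the per-class iteration can be sketched as follows. Everything here is an assumption made for illustration: the `demand` field, the two-pass slot split, and the `even_split` stand-in scheduler are not described in the abstract. The minimum guarantees are passed in as plain numbers, so a caller could recompute them before each scheduling round to make them dynamic.

```python
def allocate_slots(total_slots, classes):
    """Split slots across priority classes.

    classes: list of (name, min_slots, demand), highest priority first.
    """
    alloc = {}
    remaining = total_slots
    # Pass 1: honor each class's minimum guarantee (capped by its demand).
    for name, min_slots, demand in classes:
        grant = min(min_slots, demand, remaining)
        alloc[name] = grant
        remaining -= grant
    # Pass 2: hand leftover slots out in priority order.
    for name, _min_slots, demand in classes:
        extra = min(demand - alloc[name], remaining)
        alloc[name] += extra
        remaining -= extra
    return alloc


def schedule_by_class(total_slots, classes, jobs_by_class, scheduler):
    """Invoke the per-class scheduler once per class, in priority order."""
    alloc = allocate_slots(total_slots, classes)
    return {name: scheduler(jobs_by_class[name], alloc[name])
            for name, _min_slots, _demand in classes}


def even_split(jobs, slots):
    """Stand-in scheduler: share slots as evenly as possible among jobs."""
    base, extra = divmod(slots, len(jobs)) if jobs else (0, 0)
    return {job: base + (1 if i < extra else 0) for i, job in enumerate(jobs)}


# Hypothetical classes and jobs, 10 slots in total.
classes = [("gold", 2, 6), ("silver", 3, 5), ("bronze", 2, 4)]
jobs_by_class = {"gold": ["g1", "g2"], "silver": ["s1"], "bronze": ["b1", "b2"]}
plan = schedule_by_class(10, classes, jobs_by_class, even_split)
```

With these numbers every class first receives its minimum (2, 3, and 2 slots), and the three leftover slots go to the highest-priority class. The per-class scheduler is passed in as a parameter, so an optimizing MapReduce scheduler could replace `even_split`.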
Abstract:
Methods and arrangements for planning and scheduling change management requests in computing systems are disclosed. Included are an arrangement for deciding whether or not a request for change (RFC) should be done, an arrangement for assigning individual tasks to acceptable servers for each RFC to be done, and an arrangement for assigning start times to said individual tasks for each RFC to be done.
Abstract:
A method is provided for generating a resource function estimate of resource usage by an instance of a processing element configured to consume zero or more input data streams in a stream processing system having a set of available resources. The method comprises receiving at least one specified performance metric for the zero or more input data streams and a processing power of the set of available resources, wherein one specified performance metric is stream rate; generating a multi-part signature of executable-specific information for the processing element and a multi-part signature of context-specific information for the instance; accessing a database of resource functions to identify a static resource function corresponding to the executable-specific information and a context-dependent resource function corresponding to the context-specific information; combining the static resource function and the context-dependent resource function to form a composite resource function for the instance; and applying the composite resource function to the at least one specified performance metric and the processing power to generate the resource function estimate of the at least one specified performance metric for processing by the instance.
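The lookup-and-combine step can be sketched in a few lines. The abstract does not say how the two functions are combined, so the addition used here is purely an assumption, as are the signature tuples, the coefficients, and the dictionary "database".

```python
# Hypothetical database of resource functions, keyed by signature tuples.
STATIC_FUNCTIONS = {
    ("filter", "v1"): lambda rate, power: 0.5 * rate / power,
}
CONTEXT_FUNCTIONS = {
    ("node-class-a", "upstream-burst"): lambda rate, power: 0.25 * rate / power,
}


def composite_resource_function(exe_signature, ctx_signature):
    """Look up both parts by signature and return their combination."""
    static_fn = STATIC_FUNCTIONS[exe_signature]
    context_fn = CONTEXT_FUNCTIONS[ctx_signature]

    def composite(stream_rate, processing_power):
        # Assumption: the two parts combine by addition.
        return (static_fn(stream_rate, processing_power)
                + context_fn(stream_rate, processing_power))

    return composite


# Apply the composite function to a stream rate and a processing power.
estimate = composite_resource_function(
    ("filter", "v1"), ("node-class-a", "upstream-burst"))(100.0, 2.0)
```

Keeping the executable-specific and context-specific parts in separate tables means a function learned for one executable can be reused across deployment contexts, which appears to be the motivation for the two-part signature.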
Abstract:
A method, system, and computer program product for implementing stream processing are provided. The system includes an application framework and applications containing dataflow graphs managed by the application framework running on a first network. The system also includes at least one circuit switch in the first network having a configuration that is controlled by the application framework, a plurality of processing nodes interconnected by the first network over one of wireline and wireless links, and a second network for providing at least one of control and additional data transfer over the first network. The application framework reconfigures circuit switches in response to monitoring aspects of the applications and the first network.
Abstract:
Techniques for performing capacity planning for applications running on a computational infrastructure are provided. The techniques include instrumenting an application under development to receive one or more performance metrics under a physical deployment plan, receiving the one or more performance metrics from the computational infrastructure hosting one or more applications that are currently running, using a predictive inference engine to determine how the application under development can be deployed, and using the determination to perform capacity planning for the applications on the computational infrastructure.
Abstract:
A method, apparatus, and computer program product for scheduling stream-based applications in a distributed computer system with configurable networks are provided. The method includes choosing, at a highest temporal level, jobs that will run, an optimal template alternative for the jobs that will run, network topology, and candidate processing nodes for processing elements of the optimal template alternative for each running job to maximize importance of work performed by the system. The method further includes making, at a medium temporal level, fractional allocations and re-allocations of the candidate processing elements to the processing nodes in the system to react to changing importance of the work. The method also includes revising, at a lowest temporal level, the fractional allocations and re-allocations on a continual basis to react to burstiness of the work, and to differences between projected and real progress of the work.
Abstract:
Methods and arrangements for automatically determining allowable sequences of changes, e.g., sequences where the order in which changes are carried out will transition a computing system from a workable state into another workable state, are disclosed.
Abstract:
Methods and arrangements for operating distributed computing systems, and more particularly techniques for constructing and analyzing change plans, are disclosed. Included are an arrangement for submitting a request for change to the system, an arrangement for specifying the order in which tasks execute in compliance with data and temporal dependency constraints, and an arrangement for creating a change plan.
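Ordering tasks so that dependency constraints hold is commonly done with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). It is not the method the abstract discloses, and the task names and the `build_change_plan` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def build_change_plan(dependencies):
    """Return tasks in an order that runs every prerequisite first.

    dependencies: maps each task to the set of tasks it must wait for.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"dependency cycle: {err.args[1]}") from err


# Hypothetical request for change broken into four tasks.
deps = {
    "restart_app": {"upgrade_db", "deploy_binaries"},
    "upgrade_db": {"backup_db"},
    "deploy_binaries": set(),
    "backup_db": set(),
}
plan = build_change_plan(deps)
```

Any order returned this way places `backup_db` before `upgrade_db` and both prerequisites before `restart_app`. Temporal constraints such as maintenance windows would need additional checks that this sketch does not attempt.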
Abstract:
Disclosed is a method for controlling a web farm having a plurality of websites and servers, the method comprising categorizing customer requests received from said websites into a plurality of categories, said categories comprising shareable customer requests and unshareable customer requests, routing said shareable customer requests such that any of said servers may process shareable customer requests received from different said websites, and routing said unshareable customer requests from specific said websites only to specific servers to which said specific websites have been assigned.
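The routing rule can be illustrated with a small sketch. The least-loaded selection policy, the request dictionary layout, and the server and site names are assumptions added for the example; the abstract specifies only which servers are eligible for each category.

```python
def route(request, servers, assigned, load):
    """Pick a server for one request and record the added load."""
    if request["shareable"]:
        candidates = servers                      # any server may serve it
    else:
        candidates = assigned[request["site"]]    # only the site's own servers
    # Assumption: choose the least-loaded eligible server, ties by name.
    target = min(candidates, key=lambda s: (load[s], s))
    load[target] += 1
    return target


# Hypothetical farm: three servers, two websites.
servers = ["s1", "s2", "s3"]
assigned = {"shop": ["s1"], "news": ["s2", "s3"]}
load = {"s1": 2, "s2": 0, "s3": 1}

first = route({"site": "shop", "shareable": True}, servers, assigned, load)
second = route({"site": "shop", "shareable": False}, servers, assigned, load)
```

The shareable request from "shop" lands on `s2`, a server assigned to another site, while the unshareable request stays on `s1`, the only server assigned to "shop".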