Abstract:
Determining a schedule of a batch workload of MapReduce jobs is disclosed. A set of multi-stage jobs for processing in a MapReduce framework is received, for example, in a master node. Each multi-stage job includes a duration attribute, and each duration attribute includes a stage duration and a stage type. The MapReduce framework is separated into a plurality of resource pools, and the multi-stage jobs are separated into a plurality of subgroups corresponding to the plurality of resource pools. Each subgroup is configured for concurrent processing in the MapReduce framework. The multi-stage jobs in each subgroup are ordered by increasing stage duration. For each pool, the multi-stage jobs are assigned sequentially, in increasing order of stage duration, from either the front or the tail of the schedule according to stage type.
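The front/tail assignment by stage type resembles Johnson's two-stage flow-shop rule. Below is a minimal Python sketch of that ordering for one pool, assuming each job carries the duration of its shorter stage and a tag naming that stage; the function and field names are illustrative, not from the abstract.

    def order_pool(jobs):
        # jobs: list of (name, stage_duration, stage_type) tuples, where
        # stage_duration is the duration attribute and stage_type is
        # 'map' or 'reduce'.
        front, tail = [], []
        for job in sorted(jobs, key=lambda j: j[1]):   # increasing stage duration
            if job[2] == 'map':
                front.append(job)       # assign from the front of the schedule
            else:
                tail.insert(0, job)     # assign from the tail of the schedule
        return front + tail

    pool = [("j1", 4, 'reduce'), ("j2", 2, 'map'), ("j3", 1, 'reduce'), ("j4", 3, 'map')]
    print([name for name, *_ in order_pool(pool)])     # ['j2', 'j4', 'j1', 'j3']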
Abstract:
Systems and methods are used to provide distributed processing on a service provider network that includes a plurality of remotely located consumer devices, each of which includes a processing device. A service is provided from the service provider network to the remotely located consumer devices. A task is processed in distributed fashion on the processing devices of the remotely located consumer devices; this distributed processing is unrelated to the service provided to the consumers and occurs even while the processing devices are in use by their corresponding remotely located consumer devices.
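As an illustration only, the sketch below has a coordinator hand chunks of an unrelated task to worker loops standing in for consumer devices; the queue, the chunk granularity, and the stand-in computation are all assumptions, since the abstract does not specify them.

    import queue, threading

    work = queue.Queue()
    for chunk in range(8):                 # unrelated task split into chunks
        work.put(chunk)

    def device(device_id):
        # The device keeps delivering its normal service; here it simply
        # drains distributed work that is unrelated to that service.
        while True:
            try:
                chunk = work.get_nowait()
            except queue.Empty:
                return
            print(f"device {device_id}: chunk {chunk} -> {chunk * chunk}")

    threads = [threading.Thread(target=device, args=(i,)) for i in range(3)]
    for t in threads: t.start()
    for t in threads: t.join()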
Abstract:
A method of rasterising a document using a plurality of threads interprets objects of the document by performing interpreting tasks associated with the objects. Objects associated with different pages are interpreted in parallel. Each performed interpreting task establishes a plurality of rasterising tasks. The method estimates the amount of parallelisable work available to be performed using the plurality of threads, based on the established rasterising tasks and an expected number of interpreting tasks still to be performed. Based on the estimated amount of parallelisable work, the method selects either (i) an interpreting task to interpret objects of the document or (ii) a rasterising task from the established plurality of rasterising tasks, and then executes the selected task using at least one thread to rasterise the document.
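A small Python sketch of the selection step, under the assumption that the "amount of parallelisable work" is simply the count of ready rasterising tasks plus the expected interpreting tasks; the threshold (the thread count) and all names are invented for illustration.

    def select_task(raster_queue, interpret_queue, expected_interpreting, n_threads):
        parallelisable = len(raster_queue) + expected_interpreting
        # Too little parallel work to keep the threads busy: prefer an
        # interpreting task, since each one fans out into several
        # rasterising tasks and replenishes the pool of parallel work.
        if parallelisable < n_threads and interpret_queue:
            return interpret_queue.pop(0)
        if raster_queue:
            return raster_queue.pop(0)
        return interpret_queue.pop(0) if interpret_queue else None

    print(select_task([], ["interpret page 1"], expected_interpreting=1, n_threads=4))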
Abstract:
Methods and systems to process a request received at an application program interface are described. The system receives, at an application program interface, a request from a client machine that includes a job associated with data. A peer-to-peer network of processing nodes generates a plurality of sub-jobs based on the job, schedules the sub-jobs for parallel processing based on the availability of the resources that each sub-job utilizes, and processes the sub-jobs in parallel to generate task results respectively associated with the plurality of sub-jobs.
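The abstract does not say how sub-jobs are derived or how resource availability is checked, so the sketch below shards the job's data round-robin and gates each shard on a caller-supplied availability predicate; every identifier is illustrative.

    from concurrent.futures import ThreadPoolExecutor

    def split_job(data, n_sub_jobs):
        return [data[i::n_sub_jobs] for i in range(n_sub_jobs)]   # round-robin shards

    def process(shard):
        return sum(shard)                  # stand-in for the real sub-job work

    def run_job(data, n_sub_jobs, resource_available):
        sub_jobs = split_job(data, n_sub_jobs)
        ready = [s for s in sub_jobs if resource_available(s)]
        with ThreadPoolExecutor() as pool:                        # parallel processing
            return list(pool.map(process, ready))                 # per-sub-job results

    print(run_job(list(range(100)), 4, lambda shard: True))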
Abstract:
A software engine for decomposing work to be done into tasks and distributing the tasks to multiple, independent CPUs for execution is described. The engine utilizes dynamic code generation, with run-time specialization of variables, to achieve high performance. Problems are decomposed according to methods that enhance parallel CPU operation and provide better opportunities for specialization and optimization of dynamically generated code. A specific application of this engine, a software three-dimensional (3D) graphical image renderer, is described.
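A toy Python illustration of run-time specialization of variables: a kernel is generated as source with a run-time value baked in as a constant, then compiled, so the specialized variable costs nothing per element. It mirrors the idea, not the engine's actual generator.

    def specialize_scale(factor):
        # Generate source with `factor` folded in as a literal constant.
        src = f"def kernel(pixels):\n    return [p * {factor} for p in pixels]\n"
        namespace = {}
        exec(compile(src, '<generated>', 'exec'), namespace)
        return namespace['kernel']

    kernel = specialize_scale(3)           # 3 is now a compile-time constant
    print(kernel([1, 2, 4]))               # [3, 6, 12]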
Abstract:
A batch job processing architecture dynamically creates runtime tasks for batch job execution and optimizes parallelism. Task creation can be based on the amount of processing power available locally or across batch servers. The work can be allocated across multiple threads in multiple batch server instances as they are available. A master task splits the items to be processed into smaller parts and creates a runtime task for each. The batch server picks up and executes as many runtime tasks as the server is configured to handle, and the runtime tasks can be run in parallel to maximize hardware utilization. Scalability is provided by splitting runtime task execution across available batch server instances and across machines. During runtime task creation, all dependency and batch group information is propagated from the master task to all runtime tasks; dependencies and batch group configuration are honored by the batch engine.
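A minimal sketch of the split-and-execute flow, assuming a fixed bundle size and a per-server cap on concurrent runtime tasks; the names, the cap, and the placeholder dependency and batch-group values are illustrative.

    from concurrent.futures import ThreadPoolExecutor

    MAX_RUNTIME_TASKS = 4                  # per-server configuration (assumed)

    def master_split(items, bundle_size):
        # The master task splits the items into smaller parts.
        return [items[i:i + bundle_size] for i in range(0, len(items), bundle_size)]

    def runtime_task(bundle, dependencies, batch_group):
        # Dependency and batch-group info is propagated from the master task.
        return [item * 2 for item in bundle]       # stand-in for the real work

    bundles = master_split(list(range(20)), bundle_size=5)
    with ThreadPoolExecutor(max_workers=MAX_RUNTIME_TASKS) as pool:
        futures = [pool.submit(runtime_task, b, None, "group-1") for b in bundles]
        print([f.result() for f in futures])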
Abstract:
Various embodiments relating to performing multiple computations are provided. In one embodiment, a computing system includes a computation device and an off-chip storage device configured to store a plurality of stream elements and associated tags. The computation device includes an on-chip storage device configured to store a plurality of independently addressable resident elements, and a plurality of parallel processing units. Each parallel processing unit may be configured to receive one or more stream elements and associated tags from the off-chip storage device and to select one or more resident elements from a subset of resident elements driven in parallel from the on-chip storage device, a selected resident element being indicated by an associated tag as matching a stream element. Each parallel processing unit may be configured to perform one or more computations using the one or more stream elements and the one or more selected resident elements.
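A scalar Python model of the tag-matching dataflow, ignoring the hardware parallelism: each stream element arrives with a tag, the unit scans a subset of resident-element addresses, and a resident element is selected when its address matches the tag. The names and the stand-in multiply are assumptions.

    def processing_unit(stream, resident, subset_addrs):
        results = []
        for element, tag in stream:                # (stream element, tag) pairs
            # Resident elements whose address matches the tag are selected.
            matches = [resident[a] for a in subset_addrs if a == tag]
            for r in matches:
                results.append(element * r)        # stand-in computation
        return results

    resident = {0: 10, 1: 20, 2: 30}               # independently addressable
    print(processing_unit([(5, 1), (7, 2)], resident, subset_addrs=[0, 1, 2]))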
Abstract:
A host computer has one or more physical central processing units (CPUs) that support the execution of a plurality of containers, where each container includes one or more processes. Each process of a container is assigned to execute exclusively on a corresponding physical CPU when the corresponding container is determined to be latency sensitive. The assignment of a process to execute exclusively on a corresponding physical CPU includes the migration of tasks from the corresponding physical CPU to one or more other physical CPUs of the host computer, and the directing of task and interrupt processing to the one or more other physical CPUs. Tasks of the process corresponding to the container are then executed on the corresponding physical CPU.
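A Linux-only sketch of the exclusive placement using os.sched_setaffinity: the latency-sensitive container's process is pinned to one physical CPU and other work is steered away from it. Plain Python cannot redirect interrupt processing, so treat this as an approximation; the CPU numbers are arbitrary.

    import os

    LATENCY_CPU = 3                                 # CPU reserved for the container
    OTHER_CPUS = set(range(os.cpu_count())) - {LATENCY_CPU}

    def pin_exclusive(pid):
        # The container's process runs only on its dedicated physical CPU.
        os.sched_setaffinity(pid, {LATENCY_CPU})

    def migrate_away(pid):
        # Other tasks are moved off the dedicated CPU.
        os.sched_setaffinity(pid, OTHER_CPUS)

    pin_exclusive(0)                                # 0 = the calling process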
Abstract:
Techniques to control power and processing among a plurality of asymmetric cores are described. In one embodiment, one or more asymmetric cores are power-managed to migrate processes or threads among a plurality of cores according to the performance and power needs of the system.
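A toy migration policy in the same vein, using the Linux affinity call: a thread moves to the high-performance core under load and back to the low-power core when idle. The threshold and core numbering are invented for illustration.

    import os

    LITTLE_CORE, BIG_CORE = 0, 1                   # asymmetric core IDs (assumed)

    def rebalance(pid, utilization):
        # High demand -> performance core; low demand -> power-efficient core.
        target = {BIG_CORE} if utilization > 0.75 else {LITTLE_CORE}
        os.sched_setaffinity(pid, target)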
Abstract:
Systems and methods for a task scheduler with dynamic adjustment of concurrency levels and task granularity are disclosed for improved execution of highly concurrent analytical and transactional systems. The task scheduler can avoid both overcommitment and underutilization of computing resources by monitoring and controlling the number of active worker threads. The number of active worker threads can be adapted to avoid underutilization by giving the OS control of additional worker threads that process blocked application tasks. The task scheduler can dynamically determine the number of parallel operations for a particular task based on the number of available worker threads, which can be determined from the average availability of worker threads in the recent history of the application. Based on the number of available worker threads, a partitionable operation can be partitioned into a number of sub-operations and executed in parallel.
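A sketch of the partitioning decision, assuming "recent history" is a sliding window of free-worker-thread samples whose average sets the number of sub-operations; the class name and window size are illustrative.

    from collections import deque

    class ConcurrencyHint:
        def __init__(self, window=16):
            self.samples = deque(maxlen=window)    # recent free-worker counts

        def record(self, free_workers):
            self.samples.append(free_workers)

        def available(self):
            # Average availability over the recent history.
            return sum(self.samples) // max(len(self.samples), 1)

    def partition(items, hint):
        n = max(hint.available(), 1)               # at least one sub-operation
        return [items[i::n] for i in range(n)]     # run these in parallel

    hint = ConcurrencyHint()
    for free in (3, 5, 4):
        hint.record(free)
    print(partition(list(range(10)), hint))        # average = 4 sub-operations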