Proactive Resource Management for Parallel Work-Stealing Processing Systems

    公开(公告)号:US20170083364A1

    公开(公告)日:2017-03-23

    申请号:US14862373

    申请日:2015-09-23

    Abstract: Various embodiments proactively balance workloads between a plurality of processing units of a multi-processor computing device by making work-stealing determinations based on operating state data. An embodiment method includes obtaining static characteristics data associated with each of a victim processor and one or more of a plurality of processing units that are ready to steal work items from the victim processor (work-ready processors), obtaining dynamic characteristics data for each of the processors, calculating priority values for each of the processors based on the obtained data, and transferring a number of work items assigned to the victim processor to a winning work-ready processor based on the calculated priority values. In some embodiments, the method may include acquiring control over a probabilistic lock for a shared data structure and updating the shared data structure to indicate the number of work items transferred to the winning work-ready processor.

    Managing Data Flow in Heterogeneous Computing

    公开(公告)号:US20180074727A1

    公开(公告)日:2018-03-15

    申请号:US15266656

    申请日:2016-09-15

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing data flow management on a computing device. Embodiment methods may include initializing a buffer partition of a first memory of a first heterogeneous processing device for an output of execution of a first iteration of a first operation by the first heterogeneous processing device on which a first iteration of a second operation assigned for execution by a second heterogeneous processing device depends. Embodiment methods may include identifying a memory management operation for transmitting the output by the first heterogeneous processing device from the buffer partition as an input to the second heterogeneous processing device. Embodiment methods may include allocating a second memory for storing data for an iteration executed by a third heterogeneous processing device to minimize a number of memory management operations for the second allocated memory.

    Speculative loop iteration partitioning for heterogeneous execution

    公开(公告)号:US10261831B2

    公开(公告)日:2019-04-16

    申请号:US15245604

    申请日:2016-08-24

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing speculative loop iteration partitioning (SLIP) for heterogeneous processing devices. A computing device may receive iteration information for a first partition of iterations of a repetitive process and select a SLIP heuristic based on available SLIP information and iteration information for the first partition. The computing device may determine a split value for the first partition using the SLIP heuristic, and partition the first partition using the split value to produce a plurality of next partitions.

    Managing data flow in heterogeneous computing

    公开(公告)号:US10152243B2

    公开(公告)日:2018-12-11

    申请号:US15266656

    申请日:2016-09-15

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing data flow management on a computing device. Embodiment methods may include initializing a buffer partition of a first memory of a first heterogeneous processing device for an output of execution of a first iteration of a first operation by the first heterogeneous processing device on which a first iteration of a second operation assigned for execution by a second heterogeneous processing device depends. Embodiment methods may include identifying a memory management operation for transmitting the output by the first heterogeneous processing device from the buffer partition as an input to the second heterogeneous processing device. Embodiment methods may include allocating a second memory for storing data for an iteration executed by a third heterogeneous processing device to minimize a number of memory management operations for the second allocated memory.

    Identifying enhanced synchronization operation outcomes to improve runtime operations

    公开(公告)号:US10114681B2

    公开(公告)日:2018-10-30

    申请号:US15085108

    申请日:2016-03-30

    Abstract: Embodiments include computing devices, systems, and methods identifying enhanced synchronization operation outcomes. A computing device may receive a first resource access request for a first resource of a computing device including a first requester identifier from a first computing element of the computing device. The computing device may also receive a second resource access request for the first resource including a second requester identifier from a second computing element of the computing device. The computing device may grant the first computing element access to the first resource based on the first resource access request, and return a response to the second computing element including the first requester identifier as a winner computing element identifier.

    Adaptive Chunk Size Tuning for Data Parallel Processing on Multi-core Architecture

    公开(公告)号:US20170083365A1

    公开(公告)日:2017-03-23

    申请号:US14862398

    申请日:2015-09-23

    CPC classification number: G06F9/4881 G06F9/465 G06F9/4843

    Abstract: Methods, devices, and non-transitory process-readable storage media for dynamically adapting a frequency for detecting work-stealing operations in a multi-processor computing device. A method according to various embodiments and performed by a processor includes determining whether any work items of a cooperative task have been reassigned from a first processing unit to a second processing unit, calculating a chunk size using a default equation in response to determining that no work items of the cooperative task have been reassigned from the first processing unit, calculating the chunk size using a victim equation in response to determining that one or more work items of the cooperative task have been reassigned from the first processing unit, and executing a set of work items of the cooperative task that correspond to the calculated chunk size.

    Method for exploiting parallelism in task-based systems using an iteration space splitter
    7.
    发明授权
    Method for exploiting parallelism in task-based systems using an iteration space splitter 有权
    使用迭代空间分离器在基于任务的系统中利用并行性的方法

    公开(公告)号:US09501328B2

    公开(公告)日:2016-11-22

    申请号:US14673857

    申请日:2015-03-30

    CPC classification number: G06F9/5066 G06F9/5027

    Abstract: Embodiments include computing devices, systems, and methods for task-based handling of repetitive processes in parallel. At least one processor of the computing device, or a specialized hardware controller, may be configured to partition iterations of a repetitive process and assign the partitions to initialized tasks to be executed in parallel by a plurality of processor cores. Upon completing a task, remaining divisible partitions of the repetitive process of ongoing tasks may be subpartitioned and assigned to the ongoing task, and the completed task or a newly initialized task. Information about the iteration space for a repetitive process may be stored in a descriptor table, and status information for all partitions of a repetitive process stored in a status table. Each processor core may have an associated local table that tracks iteration execution of each task, and is synchronized with the status table.

    Abstract translation: 实施例包括用于并行地重复处理的基于任务的处理的计算设备,系统和方法。 计算设备的至少一个处理器或专用硬件控制器可以被配置为分区重复过程的迭代,并且将分区分配给由多个处理器核并行执行的初始化任务。 完成任务后,正在执行的任务的重复进程的剩余可分区可以被分分区并分配给正在进行的任务,以及完成的任务或新初始化的任务。 关于重复过程的迭代空间的信息可以存储在描述符表中,以及存储在状态表中的重复进程的所有分区的状态信息。 每个处理器核心可以具有跟踪每个任务的迭代执行的相关联的本地表,并且与状态表同步。

    Proactive resource management for parallel work-stealing processing systems

    公开(公告)号:US10360063B2

    公开(公告)日:2019-07-23

    申请号:US14862373

    申请日:2015-09-23

    Abstract: Various embodiments proactively balance workloads between a plurality of processing units of a multi-processor computing device by making work-stealing determinations based on operating state data. An embodiment method includes obtaining static characteristics data associated with each of a victim processor and one or more of a plurality of processing units that are ready to steal work items from the victim processor (work-ready processors), obtaining dynamic characteristics data for each of the processors, calculating priority values for each of the processors based on the obtained data, and transferring a number of work items assigned to the victim processor to a winning work-ready processor based on the calculated priority values. In some embodiments, the method may include acquiring control over a probabilistic lock for a shared data structure and updating the shared data structure to indicate the number of work items transferred to the winning work-ready processor.

    Identifying Enhanced Synchronization Operation Outcomes to Improve Runtime Operations

    公开(公告)号:US20170286182A1

    公开(公告)日:2017-10-05

    申请号:US15085108

    申请日:2016-03-30

    CPC classification number: G06F9/52 G06F9/46

    Abstract: Embodiments include computing devices, systems, and methods identifying enhanced synchronization operation outcomes. A computing device may receive a first resource access request for a first resource of a computing device including a first requester identifier from a first computing element of the computing device. The computing device may also receive a second resource access request for the first resource including a second requester identifier from a second computing element of the computing device. The computing device may grant the first computing element access to the first resource based on the first resource access request, and return a response to the second computing element including the first requester identifier as a winner computing element identifier.

Patent Agency Ranking