Managing Data Flow in Heterogeneous Computing

    Publication number: US20180074727A1

    Publication date: 2018-03-15

    Application number: US15266656

    Filing date: 2016-09-15

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing data flow management on a computing device. Embodiment methods may include initializing a buffer partition of a first memory of a first heterogeneous processing device for an output of execution of a first iteration of a first operation by the first heterogeneous processing device on which a first iteration of a second operation assigned for execution by a second heterogeneous processing device depends. Embodiment methods may include identifying a memory management operation for transmitting the output by the first heterogeneous processing device from the buffer partition as an input to the second heterogeneous processing device. Embodiment methods may include allocating a second memory for storing data for an iteration executed by a third heterogeneous processing device to minimize a number of memory management operations for the second allocated memory.
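
The dependency described in this abstract can be pictured with a small Python sketch. It only illustrates the step of identifying which outputs need a memory management operation: an output produced on one device and consumed on another is flagged for a copy. The function `plan_transfers`, the device names, and the dictionary layout are hypothetical and are not taken from the patent.

```python
def plan_transfers(producer_device, consumer_device, depends_on):
    """List the copies needed when a consumer iteration runs on another device.

    producer_device[i]: device running iteration i of the first operation.
    consumer_device[j]: device running iteration j of the second operation.
    depends_on[j]: iterations of the first operation that iteration j reads.
    """
    transfers = set()
    for j, inputs in depends_on.items():
        for i in inputs:
            src, dst = producer_device[i], consumer_device[j]
            if src != dst:
                # Output i would get its own buffer partition on src,
                # followed by one copy from that partition to dst.
                transfers.add((i, src, dst))
    return sorted(transfers)


# Hypothetical assignment of three iterations to two devices.
producer_device = {0: "cpu", 1: "cpu", 2: "gpu"}
consumer_device = {0: "gpu", 1: "cpu", 2: "gpu"}
depends_on = {0: [0], 1: [1], 2: [1, 2]}

transfers = plan_transfers(producer_device, consumer_device, depends_on)
print(transfers)
```

Iterations whose producer and consumer share a device generate no entry, which is the sense in which a good allocation keeps the number of memory management operations low.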

    Random-Access Disjoint Concurrent Sparse Writes to Heterogeneous Buffers

    Publication number: US20170206035A1

    Publication date: 2017-07-20

    Application number: US15000667

    Filing date: 2016-01-19

    Abstract: Methods, devices, and non-transitory processor-readable storage media for a computing device to merge concurrent writes from a plurality of processing units to a buffer associated with an application. An embodiment method executed by a processor may include identifying a plurality of concurrent requests to access the buffer that are sparse, disjoint, and write-only, configuring a write-set for each of the plurality of processing units, executing the plurality of concurrent requests to access the buffer using the write-sets, determining whether each of the plurality of concurrent requests to access the buffer is complete, obtaining a buffer index and data via the write-set of each of the plurality of processing units, and writing to the buffer using the received buffer index and data via the write-set of each of the plurality of processing units in response to determining that each of the plurality of concurrent requests to access the buffer is complete.
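
A minimal Python sketch of the write-set idea in this abstract follows. It assumes each processing unit can be modeled as a thread that records (index, value) pairs in a private list rather than writing to the shared buffer, and that the merge runs only after every request has completed. The names `sparse_worker`, `merge_write_sets`, and `run_sparse_writes` are hypothetical and do not come from the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def sparse_worker(indices, compute):
    """Record sparse writes in a private write-set, not in the shared buffer."""
    write_set = []
    for i in indices:
        write_set.append((i, compute(i)))
    return write_set


def merge_write_sets(buffer, write_sets):
    """Apply every recorded (index, value) pair to the buffer."""
    for write_set in write_sets:
        for index, value in write_set:
            buffer[index] = value
    return buffer


def run_sparse_writes(buffer, index_groups, compute):
    # Each group of indices stands in for one processing unit.
    # The requests must be disjoint, so overlapping groups are rejected.
    seen = set()
    for group in index_groups:
        if seen & set(group):
            raise ValueError("write requests are not disjoint")
        seen |= set(group)
    with ThreadPoolExecutor(max_workers=len(index_groups)) as pool:
        futures = [pool.submit(sparse_worker, g, compute) for g in index_groups]
        # result() blocks, so the merge starts only after all requests finish.
        write_sets = [f.result() for f in futures]
    return merge_write_sets(buffer, write_sets)


buffer = [0] * 10
result = run_sparse_writes(buffer, [[1, 7], [3], [8, 2]], lambda i: i * i)
print(result)  # [0, 1, 4, 9, 0, 0, 0, 49, 64, 0]
```

Because the requests are write-only and disjoint, the order in which the write-sets are merged does not affect the final buffer contents.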

    Identifying Enhanced Synchronization Operation Outcomes to Improve Runtime Operations

    Publication number: US20170286182A1

    Publication date: 2017-10-05

    Application number: US15085108

    Filing date: 2016-03-30

    CPC classification number: G06F9/52 G06F9/46

    Abstract: Embodiments include computing devices, systems, and methods identifying enhanced synchronization operation outcomes. A computing device may receive a first resource access request for a first resource of a computing device including a first requester identifier from a first computing element of the computing device. The computing device may also receive a second resource access request for the first resource including a second requester identifier from a second computing element of the computing device. The computing device may grant the first computing element access to the first resource based on the first resource access request, and return a response to the second computing element including the first requester identifier as a winner computing element identifier.
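
The outcome described here, where the losing requester learns who won, can be sketched in Python as below. This is an illustration under assumptions, not the patented mechanism: the `Resource` class, its method names, and the string requester identifiers are all hypothetical, and a software lock stands in for whatever hardware or runtime arbitration a real device would use.

```python
import threading


class Resource:
    """Grants access to one requester and reports the winner to the others."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def request_access(self, requester_id):
        """Return (granted, winner_id), where winner_id holds the resource."""
        with self._lock:
            if self._owner is None:
                self._owner = requester_id
                return True, requester_id
            # Denied: tell the caller which requester currently owns it.
            return False, self._owner

    def release(self, requester_id):
        """Free the resource if the caller is the current owner."""
        with self._lock:
            if self._owner == requester_id:
                self._owner = None


resource = Resource()
first = resource.request_access("cpu0")
second = resource.request_access("gpu0")
print(first)   # (True, 'cpu0')
print(second)  # (False, 'cpu0')
```

Returning the winner identifier gives the losing element more than a bare failure: it can, for example, decide to wait on or hand work to the element that holds the resource.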

    Method for Exploiting Parallelism in Task-Based Systems Using an Iteration Space Splitter

    Publication number: US20160292012A1

    Publication date: 2016-10-06

    Application number: US14673857

    Filing date: 2015-03-30

    CPC classification number: G06F9/5066 G06F9/5027

    Abstract: Embodiments include computing devices, systems, and methods for task-based handling of repetitive processes in parallel. At least one processor of the computing device, or a specialized hardware controller, may be configured to partition iterations of a repetitive process and assign the partitions to initialized tasks to be executed in parallel by a plurality of processor cores. Upon completing a task, remaining divisible partitions of the repetitive process of ongoing tasks may be subpartitioned and assigned to the ongoing task, and the completed task or a newly initialized task. Information about the iteration space for a repetitive process may be stored in a descriptor table, and status information for all partitions of a repetitive process stored in a status table. Each processor core may have an associated local table that tracks iteration execution of each task, and is synchronized with the status table.
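
The splitting behaviour in this abstract can be approximated with the Python simulation below. It is a sketch under assumptions: the status table is modeled as a list of `[next, end)` ranges, one per task; tasks run in simulated rounds rather than on real cores; and a finished task takes the upper half of the largest remaining partition. The functions `partition`, `subpartition`, and `simulate` are hypothetical names, and the descriptor table and per-core local tables are omitted.

```python
def partition(start, end, parts):
    """Split the half-open range [start, end) into near-equal partitions."""
    size, extra = divmod(end - start, parts)
    bounds, lo = [], start
    for p in range(parts):
        hi = lo + size + (1 if p < extra else 0)
        bounds.append([lo, hi])
        lo = hi
    return bounds


def subpartition(status, finished):
    """Give the finished task the upper half of the largest remaining range."""
    donor = max(range(len(status)), key=lambda t: status[t][1] - status[t][0])
    lo, hi = status[donor]
    if hi - lo < 2:  # nothing divisible is left
        return False
    mid = lo + (hi - lo) // 2
    status[donor] = [lo, mid]
    status[finished] = [mid, hi]
    return True


def simulate(total, speeds):
    """Run all iterations; speeds[t] is iterations per round for task t."""
    status = partition(0, total, len(speeds))  # status table: [next, end)
    executed = [0] * len(speeds)
    while any(lo < hi for lo, hi in status):
        for t, speed in enumerate(speeds):
            lo, hi = status[t]
            step = min(speed, hi - lo)
            executed[t] += step
            status[t][0] = lo + step
            if status[t][0] == status[t][1]:
                subpartition(status, t)
    return executed


executed = simulate(100, [1, 4])
print(executed, sum(executed))
```

In this run the faster task ends up executing more than its initial half of the iterations, because it repeatedly takes over part of the slower task's remaining range.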


    Speculative loop iteration partitioning for heterogeneous execution

    Publication number: US10261831B2

    Publication date: 2019-04-16

    Application number: US15245604

    Filing date: 2016-08-24

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing speculative loop iteration partitioning (SLIP) for heterogeneous processing devices. A computing device may receive iteration information for a first partition of iterations of a repetitive process and select a SLIP heuristic based on available SLIP information and iteration information for the first partition. The computing device may determine a split value for the first partition using the SLIP heuristic, and partition the first partition using the split value to produce a plurality of next partitions.
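
The sequence in this abstract (select a heuristic from the available information, derive a split value, produce next partitions) might look like the Python sketch below. The two heuristics, the `history` dictionary of measured iterations per second, and every function name are assumptions made for illustration; the patent does not specify them here.

```python
def even_split(devices, history):
    """Fallback heuristic: no timing data, so every device gets the same share."""
    return [1.0 / len(devices)] * len(devices)


def throughput_split(devices, history):
    """Heuristic: share iterations in proportion to measured throughput."""
    total = sum(history[d] for d in devices)
    return [history[d] / total for d in devices]


def select_heuristic(devices, history):
    """Pick a heuristic based on the information available for this partition."""
    if all(history.get(d, 0) > 0 for d in devices):
        return throughput_split
    return even_split


def split_partition(start, end, devices, history):
    """Split [start, end) into one next partition per device."""
    n = end - start
    fractions = select_heuristic(devices, history)(devices, history)
    partitions, lo = [], start
    for i, frac in enumerate(fractions):
        # The last device takes the remainder so no iteration is dropped.
        last = i == len(fractions) - 1
        hi = end if last else min(end, lo + round(n * frac))
        partitions.append((lo, hi))
        lo = hi
    return partitions


devices = ["cpu", "gpu"]
no_history = split_partition(0, 1000, devices, {})
with_history = split_partition(0, 1000, devices, {"cpu": 100.0, "gpu": 300.0})
print(no_history)    # [(0, 500), (500, 1000)]
print(with_history)  # [(0, 250), (250, 1000)]
```

The split is speculative in the sense that it is chosen before the partitions run; a real system would presumably refine `history` after each partition completes and apply the heuristic again to the next one.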

    Managing data flow in heterogeneous computing

    Publication number: US10152243B2

    Publication date: 2018-12-11

    Application number: US15266656

    Filing date: 2016-09-15

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing data flow management on a computing device. Embodiment methods may include initializing a buffer partition of a first memory of a first heterogeneous processing device for an output of execution of a first iteration of a first operation by the first heterogeneous processing device on which a first iteration of a second operation assigned for execution by a second heterogeneous processing device depends. Embodiment methods may include identifying a memory management operation for transmitting the output by the first heterogeneous processing device from the buffer partition as an input to the second heterogeneous processing device. Embodiment methods may include allocating a second memory for storing data for an iteration executed by a third heterogeneous processing device to minimize a number of memory management operations for the second allocated memory.

    Identifying enhanced synchronization operation outcomes to improve runtime operations

    Publication number: US10114681B2

    Publication date: 2018-10-30

    Application number: US15085108

    Filing date: 2016-03-30

    Abstract: Embodiments include computing devices, systems, and methods identifying enhanced synchronization operation outcomes. A computing device may receive a first resource access request for a first resource of a computing device including a first requester identifier from a first computing element of the computing device. The computing device may also receive a second resource access request for the first resource including a second requester identifier from a second computing element of the computing device. The computing device may grant the first computing element access to the first resource based on the first resource access request, and return a response to the second computing element including the first requester identifier as a winner computing element identifier.

    Random-access disjoint concurrent sparse writes to heterogeneous buffers

    Publication number: US10031697B2

    Publication date: 2018-07-24

    Application number: US15000667

    Filing date: 2016-01-19

    Abstract: Methods, devices, and non-transitory processor-readable storage media for a computing device to merge concurrent writes from a plurality of processing units to a buffer associated with an application. An embodiment method executed by a processor may include identifying a plurality of concurrent requests to access the buffer that are sparse, disjoint, and write-only, configuring a write-set for each of the plurality of processing units, executing the plurality of concurrent requests to access the buffer using the write-sets, determining whether each of the plurality of concurrent requests to access the buffer is complete, obtaining a buffer index and data via the write-set of each of the plurality of processing units, and writing to the buffer using the received buffer index and data via the write-set of each of the plurality of processing units in response to determining that each of the plurality of concurrent requests to access the buffer is complete.
