Directed event signaling for multiprocessor systems

    Publication No.: US09632569B2

    Publication date: 2017-04-25

    Application No.: US14451628

    Application date: 2014-08-05

    CPC classification number: G06F1/3296 G06F9/4856 G06F9/4893 G06F9/526 Y02D10/24

    Abstract: Multi-processor computing device methods manage resource accesses by a signaling event manager signaling processor elements requesting access to a resource to wake up to access the resource when the resource is available or wait for an event when the resource is busy. Processor elements may enter a sleep state while awaiting access to the requested resource. When multiple elements are waiting for the resource, the processor element with a highest assigned priority is signaled to wake up when the resource is available without waking other elements. Priorities may be assigned to processor elements waiting for the resource based on a heuristic or parameter that may depend on a state of the computing device or the processor elements. A sleep duration may be estimated for a processor element waiting for a resource and the processor element may be removed from a scheduling queue or assigned another thread during the sleep duration.
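
The directed wake-up described in this abstract can be illustrated with a small sketch. It is not the patented implementation: the class name `SignalingEventManager`, its methods, and the use of `threading.Event` objects to stand in for processor elements sleeping until an event arrives are all assumptions made for illustration. Priorities are plain integers here, whereas the abstract leaves the heuristic open.

```python
import heapq
import threading


class SignalingEventManager:
    """Toy model: one resource, waiters woken one at a time by priority."""

    def __init__(self):
        self._lock = threading.Lock()
        self._busy = False
        self._waiters = []  # heap of (-priority, seq, element_id, event)
        self._seq = 0       # tie-breaker so equal priorities stay first-come

    def request(self, element_id, priority):
        """Return None if the resource is granted, else an Event to sleep on."""
        with self._lock:
            if not self._busy:
                self._busy = True
                return None
            event = threading.Event()
            heapq.heappush(self._waiters,
                           (-priority, self._seq, element_id, event))
            self._seq += 1
            return event

    def release(self):
        """Pass the resource to the highest-priority waiter, waking only it."""
        with self._lock:
            if not self._waiters:
                self._busy = False
                return None
            _, _, element_id, event = heapq.heappop(self._waiters)
            event.set()  # directed signal; the other waiters keep sleeping
            return element_id
```

A waiting element would call `event.wait()` on the returned object. Because only one event is set per release, lower-priority elements are not woken just to find the resource busy again.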

    METHOD FOR EXPLOITING PARALLELISM IN TASK-BASED SYSTEMS USING AN ITERATION SPACE SPLITTER

    Document type: Invention application

    Status: Granted

    Publication No.: US20160292012A1

    Publication date: 2016-10-06

    Application No.: US14673857

    Application date: 2015-03-30

    CPC classification number: G06F9/5066 G06F9/5027

    Abstract: Embodiments include computing devices, systems, and methods for task-based handling of repetitive processes in parallel. At least one processor of the computing device, or a specialized hardware controller, may be configured to partition iterations of a repetitive process and assign the partitions to initialized tasks to be executed in parallel by a plurality of processor cores. Upon completing a task, remaining divisible partitions of the repetitive process of ongoing tasks may be subpartitioned and assigned to the ongoing task, and the completed task or a newly initialized task. Information about the iteration space for a repetitive process may be stored in a descriptor table, and status information for all partitions of a repetitive process stored in a status table. Each processor core may have an associated local table that tracks iteration execution of each task, and is synchronized with the status table.
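
The partition and subpartition steps in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method: the function names, the half-open `(start, end)` ranges, and the plain dictionary standing in for the status table are all hypothetical, and the rule of handing over the upper half of the remaining work is one possible choice.

```python
def partition(start, end, parts):
    """Split the half-open range [start, end) into up to `parts` partitions."""
    total = end - start
    bounds = [start + (total * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1])
            for i in range(parts) if bounds[i] < bounds[i + 1]]


def subpartition(status, busy_task, idle_task):
    """Move the upper part of busy_task's unexecuted iterations to idle_task.

    `status` maps task id -> [next_iteration, end] and plays the role of
    the status table. Returns True if any work was moved.
    """
    nxt, end = status[busy_task]
    remaining = end - nxt
    if remaining < 2:  # a single iteration is not divisible
        return False
    mid = nxt + (remaining + 1) // 2
    status[busy_task][1] = mid      # ongoing task keeps [nxt, mid)
    status[idle_task] = [mid, end]  # completed or new task takes [mid, end)
    return True
```

For example, `partition(0, 10, 4)` yields four partitions for four tasks. When one task finishes, `subpartition` lets it take over part of the work still queued for an ongoing task.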

    Method for Exploiting Parallelism in Nested Parallel Patterns in Task-based Systems

    Document type: Invention application

    Status: Under examination (published)

    Publication No.: US20150268993A1

    Publication date: 2015-09-24

    Application No.: US14336288

    Application date: 2014-07-21

    Abstract: Aspects include computing devices, systems, and methods for task-based handling of nested repetitive processes in parallel. At least one processor of the computing device may be configured to partition iterations of an outer repetitive process and assign the partitions to initialized tasks to be executed in parallel by a plurality of processor cores. A shadow task may be initialized for each task to execute iterations of an inner repetitive process. Upon completing a task, divisible partitions of the outer repetitive process of ongoing tasks may be subpartitioned and assigned to the ongoing task, and the completed task and shadow task or a newly initialized task and shadow task. Upon completing all but one task and one iteration of the outer repetitive process, shadow tasks may be initialized to execute partitions of iterations of the inner repetitive process.
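
The switch from splitting the outer loop to splitting the inner loop can be sketched roughly as below. This is a simplified reading of the abstract, not the claimed method: `plan_work`, `split_range`, and the tuple layout are hypothetical names, and the work items tagged `"inner"` only stand in for what the abstract calls shadow tasks.

```python
def split_range(start, end, parts):
    """Split the half-open range [start, end) into up to `parts` pieces."""
    total = end - start
    bounds = [start + (total * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1])
            for i in range(parts) if bounds[i] < bounds[i + 1]]


def plan_work(outer, inner, workers):
    """Return work items as (level, outer_range, inner_range) tuples.

    While more than one outer iteration remains, the outer range is
    split and each piece runs the full inner range. Once a single outer
    iteration is left, its inner range is split instead.
    """
    o_start, o_end = outer
    if o_end - o_start > 1:
        return [("outer", part, inner)
                for part in split_range(o_start, o_end, workers)]
    return [("inner", outer, part)
            for part in split_range(inner[0], inner[1], workers)]
```

With four outer iterations and two workers, the plan splits the outer range. With one outer iteration left and three workers, the plan splits that iteration's inner range three ways, so otherwise idle workers stay busy.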

    Fine-grained power optimization for heterogeneous parallel constructs

    Publication No.: US10296074B2

    Publication date: 2019-05-21

    Application No.: US15417605

    Application date: 2017-01-27

    Abstract: Various embodiments provide methods, devices, and non-transitory processor-readable storage media enabling joint goals, such as joint power and performance goals, to be realized on a per heterogeneous processing device basis for heterogeneous parallel computing constructs. Various embodiments may enable assignments of power states for heterogeneous processing devices on a per heterogeneous processing device basis to satisfy an overall goal on the heterogeneous processing construct. Various embodiments may enable dynamic adjustment of power states for heterogeneous processing devices on a per heterogeneous processing device basis.

    Method for simplified task-based runtime for efficient parallel computing

    Publication No.: US10169105B2

    Publication date: 2019-01-01

    Application No.: US14992268

    Application date: 2016-01-11

    Abstract: Aspects include computing devices, systems, and methods for implementing scheduling and execution of lightweight kernels as simple tasks directly by a thread without setting up a task structure. A computing device may determine whether a task pointer in a task queue is a simple task pointer for the lightweight kernel. The computing device may schedule a first simple task for the lightweight kernel for execution by the thread. The computing device may retrieve, from an entry of a simple task table, a kernel pointer for the lightweight kernel. The entry in the simple task table may be associated with the simple task pointer. The computing device may directly execute the lightweight kernel as the simple task.
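
The dispatch path in this abstract, where a thread checks whether a queued pointer is a simple task pointer and, if so, looks up and runs the kernel directly, might look like the sketch below. Everything named here is an assumption for illustration: the tag bit `SIMPLE_FLAG`, the list used as the simple task table, and the `Task` class representing the heavier full task structure.

```python
from collections import deque

SIMPLE_FLAG = 1 << 63   # assumed tag bit marking a simple task pointer

simple_task_table = []  # entry index -> kernel callable ("kernel pointer")


def make_simple_task(kernel):
    """Register a lightweight kernel and return its tagged pointer."""
    simple_task_table.append(kernel)
    return SIMPLE_FLAG | (len(simple_task_table) - 1)


class Task:
    """Stand-in for a full task structure with its own bookkeeping."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.state = "ready"

    def run(self):
        self.state = "running"
        result = self.kernel()
        self.state = "done"
        return result


def run_next(queue):
    """Pop one task pointer and execute it on the calling thread."""
    ptr = queue.popleft()
    if isinstance(ptr, int) and ptr & SIMPLE_FLAG:
        # Simple task: fetch the kernel from the table and call it
        # directly, with no task structure set up.
        kernel = simple_task_table[ptr & (SIMPLE_FLAG - 1)]
        return kernel()
    return ptr.run()
```

The simple path skips allocating and updating a `Task` object, which is the overhead the abstract aims to avoid for lightweight kernels.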

    Speculative Loop Iteration Partitioning for Heterogeneous Execution

    Publication No.: US20180060130A1

    Publication date: 2018-03-01

    Application No.: US15245604

    Application date: 2016-08-24

    CPC classification number: G06F9/5027 G06F9/5066 G06F2209/5017

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing speculative loop iteration partitioning (SLIP) for heterogeneous processing devices. A computing device may receive iteration information for a first partition of iterations of a repetitive process and select a SLIP heuristic based on available SLIP information and iteration information for the first partition. The computing device may determine a split value for the first partition using the SLIP heuristic, and partition the first partition using the split value to produce a plurality of next partitions.
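
The three steps named in this abstract (select a heuristic, compute a split value, partition) can be sketched as below. The two heuristics and the `"throughput"` key are assumptions chosen for illustration, and the abstract does not say which heuristics or SLIP information the actual method uses.

```python
def split_evenly(slip_info, start, end):
    """Fallback heuristic: split the partition in the middle."""
    return start + (end - start) // 2


def split_by_throughput(slip_info, start, end):
    """Split in proportion to the assumed (fast, slow) device throughputs."""
    fast, slow = slip_info["throughput"]
    return start + (end - start) * fast // (fast + slow)


def select_heuristic(slip_info, start, end):
    """Pick a heuristic from the available SLIP and iteration information."""
    if end - start >= 2 and "throughput" in slip_info:
        return split_by_throughput
    return split_evenly


def slip_partition(slip_info, start, end):
    """Partition [start, end) into two next partitions at the split value."""
    heuristic = select_heuristic(slip_info, start, end)
    split = heuristic(slip_info, start, end)
    return [(start, split), (split, end)]
```

With a throughput ratio of 3:1 and 100 iterations, the faster device receives the first 75 iterations. With no SLIP information available, the sketch falls back to an even split.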

    Shared Virtual Index for Memory Object Fusion in Heterogeneous Cooperative Computing

    Publication No.: US20180052776A1

    Publication date: 2018-02-22

    Application No.: US15239937

    Application date: 2016-08-18

    CPC classification number: G06F12/109 G06F2212/1041 G06F2212/657

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for implementing shared virtual index translation on a computing device. The computing device may receive a base virtual address for storing an output of a kernel function execution to a dedicated memory and determine whether the virtual address is in a range of virtual addresses for a privatized output buffer within the dedicated memory, which may be smaller than the dedicated memory. The computing device may calculate a first modified physical address using a physical address mapped to the base virtual address and an offset of a first processing device associated with the dedicated memory in response to determining that the base virtual address is in the range of virtual addresses. The computing device may store the output of the kernel function execution to the privatized output buffer at the first modified physical address.
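
The address calculation in this abstract can be sketched as below. The abstract says only that the modified physical address is calculated "using" the mapped physical address and the device offset, so treating it as a simple addition is an assumption, as are the function name, the dictionary used as a page table, and the half-open address range.

```python
def translate(base_va, page_table, private_range, device_offset):
    """Return the physical address at which a kernel output is stored.

    page_table:    dict mapping virtual address -> physical address
    private_range: half-open (low, high) virtual range of the
                   privatized output buffer
    device_offset: per-device offset into the dedicated memory
    """
    physical = page_table[base_va]
    low, high = private_range
    if low <= base_va < high:
        # Inside the privatized buffer: redirect to this device's slice.
        return physical + device_offset
    return physical
```

Each processing device would pass its own `device_offset`, so kernels that write to the same virtual address land in separate regions of the dedicated memory.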

    Method For Simplified Task-based Runtime For Efficient Parallel Computing

    Document type: Invention application

    Status: Under examination (published)

    Publication No.: US20170031728A1

    Publication date: 2017-02-02

    Application No.: US14992268

    Application date: 2016-01-11

    CPC classification number: G06F9/52 G06F9/4843

    Abstract: Aspects include computing devices, systems, and methods for implementing scheduling and execution of lightweight kernels as simple tasks directly by a thread without setting up a task structure. A computing device may determine whether a task pointer in a task queue is a simple task pointer for the lightweight kernel. The computing device may schedule a first simple task for the lightweight kernel for execution by the thread. The computing device may retrieve, from an entry of a simple task table, a kernel pointer for the lightweight kernel. The entry in the simple task table may be associated with the simple task pointer. The computing device may directly execute the lightweight kernel as the simple task.
