Abstract:
A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Abstract:
A method, computer program product, and system is described that determines the correctness of using memory operations in a computing device with heterogeneous computer components. Embodiments include an optimizer based on the characteristics of a Sequential Consistency for Heterogeneous-Race-Free (SC for HRF) model that analyzes a program and determines the correctness of the ordering of events in the program. HRF models include combinations of the properties: scope order, scope inclusion, and scope transitivity. The optimizer can determine when a program is heterogeneous-race-free in accordance with an SC for HRF memory consistency model. For example, the optimizer can analyze a portion of program code, respect the properties of the SC for HRF model, and determine whether a value produced by a store memory event will be a candidate for a value observed by a load memory event. In addition, the optimizer can determine whether reordering of events is possible.
Abstract translation:描述了一种方法,计算机程序产品和系统,其确定在具有异构计算机组件的计算设备中使用存储器操作的正确性。 实施例包括基于用于异构无竞争(SC for HRF)的顺序一致性的特性的优化器,该模型分析程序并确定程序中的事件的顺序的正确性。 HRF模型包括属性的组合:范围顺序,范围包含和范围传递性。 优化器可以根据HR对HRF内存一致性模型的SC来确定程序何时是异构无竞争的。 例如,优化器可以分析程序代码的一部分,尊重SC的HRF模型的属性,并且确定由存储器存储器事件产生的值是否将是由加载存储器事件观察到的值的候选。 此外,优化器可以确定是否可能重新排序事件。
Abstract:
Methods and systems are provided for inferring types in a computer program. In one example, a method comprises: identifying a type of at least one expression of the computer program; and annotating the at least one expression in the computer program when the type of the at least one expression is at least one of a varying type and a uniform type.
Abstract:
A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.
Abstract:
Disclosed methods, systems, and computer program products embodiments include synchronizing a group of workitems on a processor by storing a respective program counter associated with each of the workitems, selecting at least one first workitem from the group for execution, and executing the selected at least one first workitem on the processor. The selecting is based upon the respective stored program counter associated with the at least one first workitem.
Abstract:
Some embodiments include a processing subsystem that compiles program code to generate compiled program code. In these embodiments, while compiling the program code, the processing subsystem first identifies a pointer in the program code that points to an unspecified address space. The processing subsystem then analyzes at least a portion of the program code to determine one or more address spaces to which the pointer may point. Next, the processor updates metadata for the pointer to indicate the one or more address spaces to which the pointer may point, the metadata enabling a determination of an address space to which the pointer points during subsequent execution of the compiled program code.
Abstract:
With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even in the case of developing high-performance code.
Abstract:
With the success of programming models such as OpenCL and CUDA, heterogeneous computing platforms are becoming mainstream. However, these heterogeneous systems are low-level, not composable, and their behavior is often implementation defined even for standardized programming models. In contrast, the method and system embodiments for the heterogeneous parallel primitives (HPP) programming model disclosed herein provide a flexible and composable programming platform that guarantees behavior even in the case of developing high-performance code.
Abstract:
Some embodiments include a processing subsystem that compiles program code to generate compiled program code. In these embodiments, while compiling the program code, the processing subsystem first identifies a pointer in the program code that points to an unspecified address space. The processing subsystem then analyzes at least a portion of the program code to determine one or more address spaces to which the pointer may point. Next, the processor updates metadata for the pointer to indicate the one or more address spaces to which the pointer may point, the metadata enabling a determination of an address space to which the pointer points during subsequent execution of the compiled program code.
Abstract:
Some embodiments include a processing subsystem that compiles program code to generate compiled program code. In these embodiments, while compiling the program code, the processing subsystem first identifies a pointer in the program code that points to an unspecified address space. The processing subsystem then analyzes at least a portion of the program code to determine one or more address spaces to which the pointer may point. Next, the processor updates metadata for the pointer to indicate the one or more address spaces to which the pointer may point, the metadata enabling a determination of an address space to which the pointer points during subsequent execution of the compiled program code.