Automatic source code generation for accelerated function calls
    3.
    Type: granted invention patent
    Legal status: in force

    Publication number: US09501269B2

    Publication date: 2016-11-22

    Application number: US14501296

    Filing date: 2014-09-30

    CPC classification number: G06F8/447

    Abstract: A programming model for a processor accelerator allows accelerated functions to be called from a main program directly without a management API for the accelerator. A compiler automatically generates wrapper source code for each accelerator function called by the application source code. The wrapper code is compiled, together with the accelerator source code, to generate an object file that is linked to an object file for the main program. By automatically generating the wrapper code, a programmer can simply and directly invoke accelerator functions without the use of a complex management API. In addition, because the wrapper code for the accelerator is generated automatically, a standard compiler can be used to compile the main program, using standard linkage conventions.
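The wrapper-generation step in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions: the `AcceleratorFunction` record, the `accel_launch` runtime entry point, and the `<name>_kernel` naming are hypothetical and do not come from the patent, and only `void` functions are modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AcceleratorFunction:
    """Hypothetical record of an accelerator function the main program calls."""
    name: str
    params: List[Tuple[str, str]]  # (C type, parameter name)


def generate_wrapper(fn: AcceleratorFunction) -> str:
    """Emit C-like wrapper source that exposes `fn` under its plain name.

    `accel_launch` and `<name>_kernel` are placeholders (assumptions) for
    whatever entry point a real accelerator runtime would provide.
    """
    param_list = ", ".join(f"{ctype} {pname}" for ctype, pname in fn.params) or "void"
    launch_args = ", ".join([f"{fn.name}_kernel"] + [pname for _, pname in fn.params])
    return (
        f"void {fn.name}({param_list})\n"
        "{\n"
        f"    accel_launch({launch_args});\n"
        "}\n"
    )


def generate_wrappers(functions: List[AcceleratorFunction]) -> str:
    """Concatenate one wrapper per accelerator function called by the application."""
    return "\n".join(generate_wrapper(fn) for fn in functions)
```

In this sketch the main program would simply call `vec_add(a, b, n)` with ordinary linkage, while the generated wrapper hides the accelerator-specific launch; how a real compiler discovers the called functions is outside what the sketch covers.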


    LOAD BALANCING FOR HETEROGENEOUS SYSTEMS
    4.
    Type: invention application
    Legal status: under examination (published)

    Publication number: US20130339978A1

    Publication date: 2013-12-19

    Application number: US13917484

    Filing date: 2013-06-13

    CPC classification number: G06F9/505 G06F9/5027 G06F9/5094 Y02D10/22

    Abstract: A method and an apparatus for performing load balancing in a heterogeneous computing system including a plurality of processing elements are presented. A program places tasks into a queue. A task from the queue is distributed to one of the plurality of processing elements, wherein the distributing includes the one processing element sending a task request to the queue and receiving a task to be done from the queue. The task is performed by the one processing element. A result of the task is sent from the one processing element to the program. The load balancing is performed by distributing tasks from the queue to processing elements that complete the tasks faster.
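The pull-based scheme in this abstract, where an idle processing element requests the next task from the queue, can be sketched as a small deterministic simulation. This is a minimal illustration, not the claimed method: the function name, the per-task costs, and the use of squaring as the "work" are all assumptions made for the example.

```python
import heapq
from collections import deque


def run_pull_based_balancing(tasks, costs):
    """Simulate processing elements pulling tasks from a shared queue.

    tasks: iterable of integers (placeholder work items; the work is squaring).
    costs: dict mapping element name -> time units per task (illustrative values).
    Returns (results, tasks_done_per_element).
    """
    queue = deque(tasks)  # the program places tasks into a queue
    # Each heap entry is (time at which the element becomes idle, element name).
    idle_at = [(0, name) for name in sorted(costs)]
    heapq.heapify(idle_at)
    results = {}
    done = {name: 0 for name in costs}
    while queue:
        now, name = heapq.heappop(idle_at)  # the next idle element sends a task request
        task = queue.popleft()              # and receives a task from the queue
        results[task] = task * task         # the element performs the task and returns the result
        done[name] += 1
        heapq.heappush(idle_at, (now + costs[name], name))
    return results, done


results, done = run_pull_based_balancing(range(8), {"fast": 1, "slow": 3})
print(done)  # {'fast': 6, 'slow': 2}
```

Because an element only asks for work when it is free, the faster element ends up with more of the tasks without any central scheduler estimating speeds in advance.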


    CACHING POLICIES FOR PROCESSING UNITS ON MULTIPLE SOCKETS

    Publication number: US20170185514A1

    Publication date: 2017-06-29

    Application number: US14981833

    Filing date: 2015-12-28

    Abstract: A processing system includes a first socket, a second socket, and an interface between the first socket and the second socket. A first memory is associated with the first socket and a second memory is associated with the second socket. The processing system also includes a controller for the first memory. The controller is to receive a first request for a first memory transaction with the second memory and perform the first memory transaction along a path that includes the interface and bypasses at least one second cache associated with the second memory.
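The bypass path in this abstract can be pictured with a toy software model. It is a loose sketch under stated assumptions: the `Socket`, `Interface`, and `Controller` classes are hypothetical, memories and caches are plain dictionaries, and coherence, writes, and timing are ignored.

```python
class Socket:
    """Toy socket with its own memory and a cache in front of that memory."""

    def __init__(self, name, memory):
        self.name = name
        self.memory = dict(memory)  # address -> value
        self.cache = {}             # cache associated with this socket's memory

    def local_read(self, addr):
        # Local accesses go through the socket's own cache, filling it on a miss.
        if addr not in self.cache:
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]


class Interface:
    """Toy link between the two sockets; counts the transfers it carries."""

    def __init__(self):
        self.transfers = 0

    def read_remote_memory(self, remote, addr):
        self.transfers += 1
        # Goes straight to the remote memory and never consults or fills remote.cache.
        return remote.memory[addr]


class Controller:
    """Toy controller for the first memory that routes each request."""

    def __init__(self, local, remote, interface):
        self.local = local
        self.remote = remote
        self.interface = interface

    def read(self, addr):
        if addr in self.local.memory:
            return self.local.local_read(addr)
        # Request targets the second memory: use the interface, bypassing its cache.
        return self.interface.read_remote_memory(self.remote, addr)
```

In this model a remote read leaves the second socket's cache untouched, which is the property the abstract describes; a real design would also have to address coherence and write handling, which the sketch deliberately leaves out.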
