Abstract:
A method and system for sharing memory between a central processing unit (CPU) and a graphics processing unit (GPU) of a computing device are disclosed herein. The method includes allocating a surface within a physical memory and mapping the surface to a plurality of virtual memory addresses within a CPU page table. The method also includes mapping the surface to a plurality of graphics virtual memory addresses within an I/O device page table.
Abstract:
A method and system for sharing memory between a central processing unit (CPU) and a graphics processing unit (GPU) of a computing device are disclosed herein. The method includes allocating a surface within a physical memory and mapping the surface to a plurality of virtual memory addresses within a CPU page table. The method also includes mapping the surface to a plurality of graphics virtual memory addresses within an 110 device page table.
Abstract:
A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one dimensional local identification generation to maximize thread residency.
Abstract:
A method and apparatus to facilitate shared pointers in a heterogeneous platform. In one embodiment of the invention, the heterogeneous or non-homogeneous platform includes, but is not limited to, a central processing core or unit, a graphics processing core or unit, a digital signal processor, an interface module, and any other form of processing cores. The heterogeneous platform has logic to facilitate sharing of pointers to a location of a memory shared by the CPU and the GPU. By sharing pointers in the heterogeneous platform, the data or information sharing between different cores in the heterogeneous platform can be simplified.
Abstract:
A method and system are described herein for an optimization technique on two aspects of thread scheduling and dispatch when the driver is allowed to pick the scheduling attributes. The present techniques rely on an enhanced GPGPU Walker hardware command and one dimensional local identification generation to maximize thread residency.
Abstract:
A mechanism is described for facilitating dynamic pipelining of workload executions at graphics processing units on computing devices. A method of embodiments, as described herein, includes generating a command buffer having a plurality of kernels relating to a plurality of workloads to be executed at a graphics processing unit (GPU), and pipelining the workloads to be processed at the GPU, where pipelining includes scheduling each kernel to be executed on the GPU based on at least one of availability of resource threads and status of one or more dependency events relating to each kernel in relation to other kernels of the plurality of kernels.
Abstract:
According to some embodiments, a graphics processor may abort a workload without requiring changes to the kernel code compilation or intruding upon graphics processing unit execution. Instead, it is possible to only read the predicate state once before starting and once before restarting a workload that has been preempted because the user wishes to abort the work. This avoids the need to read from each execution unit, reducing the drain on memory bandwidth and increasing power and performance in some embodiments.
Abstract:
A method and apparatus to facilitate shared pointers in a heterogeneous platform. In one embodiment of the invention, the heterogeneous or non-homogeneous platform includes, but is not limited to, a central processing core or unit, a graphics processing core or unit, a digital signal processor, an interface module, and any other form of processing cores. The heterogeneous platform has logic to facilitate sharing of pointers to a location of a memory shared by the CPU and the GPU. By sharing pointers in the heterogeneous platform, the data or information sharing between different cores in the heterogeneous platform can be simplified.
Abstract:
A method and system for sharing memory between a central processing unit (CPU) and a graphics processing unit (GPU) of a computing device are disclosed herein. The method includes allocating a surface within a physical memory and mapping the surface to a plurality of virtual memory addresses within a CPU page table. The method also includes mapping the surface to a plurality of graphics virtual memory addresses within an I/O device page table.
Abstract:
A method and system for shared virtual memory between a central processing unit (CPU) and a graphics processing unit (GPU) of a computing device are disclosed herein. The method includes allocating a surface within a system memory. A CPU virtual address space may be created, and the surface may be mapped to the CPU virtual address space within a CPU page table. The method also includes creating a GPU virtual address space equivalent to the CPU virtual address space, mapping the surface to the GPU virtual address space within a GPU page table, and pinning the surface.