Abstract:
This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.
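As a sketch of the kind of shared-memory message passing these interfaces describe, the following C program models the host CPU and the GPU as two threads exchanging a message word through an atomic mailbox. The mailbox layout, the 0/1/2 flag protocol, and all names are illustrative assumptions, not an API defined by the disclosure.

```c
/* Hedged sketch: host/GPU message passing modeled with two CPU threads and a
 * shared-memory mailbox. All types and names here are hypothetical. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    _Atomic uint32_t flag;   /* 0 = empty, 1 = host posted, 2 = device replied */
    uint32_t payload;        /* message word visible to both processors */
} mailbox;

static mailbox mbox = { 0, 0 };

/* "GPU" side: poll the mailbox, consume the message, post a reply. */
static void *device_task(void *arg) {
    (void)arg;
    while (atomic_load_explicit(&mbox.flag, memory_order_acquire) != 1)
        ;                                        /* spin until the host posts */
    uint32_t msg = mbox.payload;
    printf("device received %u\n", (unsigned)msg);
    mbox.payload = msg + 1;                      /* reply value */
    atomic_store_explicit(&mbox.flag, 2, memory_order_release);
    return NULL;
}

int main(void) {
    pthread_t dev;
    pthread_create(&dev, NULL, device_task, NULL);

    /* "Host CPU" side: post a message, then wait for the device's reply. */
    mbox.payload = 41;
    atomic_store_explicit(&mbox.flag, 1, memory_order_release);
    while (atomic_load_explicit(&mbox.flag, memory_order_acquire) != 2)
        ;
    printf("host received %u\n", (unsigned)mbox.payload);

    pthread_join(dev, NULL);
    return 0;
}
```

The sketch shows only the communication pattern; in the platform the abstract describes, the task itself would be launched through a command queue and the mailbox would reside in memory visible to both the host CPU and the GPU.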
Abstract:
This disclosure is directed to deferred preemption techniques for scheduling graphics processing unit (GPU) command streams for execution on a GPU. A host CPU is described that is configured to control a GPU to perform deferred-preemption scheduling. For example, a host CPU may select one or more locations in a GPU command stream at which preemption is allowed to occur in response to receiving a preemption notification, and may place one or more tokens in the GPU command stream based on the selected locations. The tokens may indicate to the GPU that preemption is allowed to occur at the selected locations. This disclosure further describes a GPU configured to preempt execution of a GPU command stream based on one or more tokens placed in the command stream.
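A minimal sketch of how token-based deferred preemption could behave, assuming the command stream is modeled as an array of opcodes and the GPU as a simple interpreter loop; CMD_PREEMPT_TOKEN, preempt_requested, and run_stream are hypothetical names, not taken from the disclosure.

```c
/* Hedged sketch: the GPU honors a preemption request only when it reaches a
 * token that the host driver placed in the command stream. */
#include <stdbool.h>
#include <stdio.h>

enum cmd { CMD_DRAW, CMD_STATE, CMD_PREEMPT_TOKEN, CMD_END };

/* Set asynchronously when a higher-priority command stream arrives. */
static volatile bool preempt_requested = false;

/* Execute commands, deferring preemption until a host-placed token. */
static int run_stream(const enum cmd *stream) {
    for (int pc = 0; stream[pc] != CMD_END; pc++) {
        if (stream[pc] == CMD_PREEMPT_TOKEN && preempt_requested)
            return pc + 1;                 /* save resume point, switch streams */
        /* ... execute CMD_DRAW / CMD_STATE here ... */
    }
    return -1;                             /* stream completed without preemption */
}

int main(void) {
    /* Host driver placed a token between the two draw groups. */
    enum cmd stream[] = { CMD_STATE, CMD_DRAW, CMD_PREEMPT_TOKEN,
                          CMD_STATE, CMD_DRAW, CMD_END };
    preempt_requested = true;              /* notification arrives mid-stream */
    int resume = run_stream(stream);
    printf("preempted, resume at command %d\n", resume);
    return 0;
}
```

The design point the abstract describes is visible in the loop: the request can arrive at any time, but the switch happens only at locations the host marked as safe, so the GPU never stops in the middle of work it cannot cheaply restore.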
Abstract:
This disclosure proposes demand paging techniques for an IO device (e.g., a GPU) that use pre-fetch and pre-back notification event signaling to reduce the latency associated with demand paging. Page faults are limited by performing the demand paging operations before the IO device actually requests unbacked memory.
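A minimal sketch of the driver-side handling such a notification could trigger, assuming a hypothetical preback_event structure and back_page() helper; the disclosure does not define these names, and a real driver would allocate and map physical pages where this sketch only prints.

```c
/* Hedged sketch: back every page in a range advertised by a pre-back
 * notification event, before the IO device touches it and faults. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096u

struct preback_event {
    uint64_t va_start;   /* first virtual address the device will access soon */
    size_t   length;     /* size of the upcoming access, in bytes */
};

/* Placeholder for pinning physical memory and installing the mapping. */
static void back_page(uint64_t va) {
    printf("backing page at 0x%llx before the device faults on it\n",
           (unsigned long long)va);
}

/* Called when the device signals a pre-back notification event: back each
 * page in the advertised range so no demand-paging fault occurs later. */
static void on_preback_event(const struct preback_event *ev) {
    uint64_t first = ev->va_start & ~(uint64_t)(PAGE_SIZE - 1);
    uint64_t last  = (ev->va_start + ev->length - 1) & ~(uint64_t)(PAGE_SIZE - 1);
    for (uint64_t va = first; va <= last; va += PAGE_SIZE)
        back_page(va);
}

int main(void) {
    struct preback_event ev = { .va_start = 0x10000ffcull, .length = 8 };
    on_preback_event(&ev);   /* the access straddles two pages; both get backed */
    return 0;
}
```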
Abstract:
A first processing unit and a second processing unit can access a system memory that stores a common page table shared by the first processing unit and the second processing unit. The common page table can store mappings of virtual memory addresses to physical memory addresses for memory chunks accessed by a job of an application. A page entry within the common page table can include a first set of attribute bits that defines accessibility of the memory chunk by the first processing unit, a second set of attribute bits that defines accessibility of the same memory chunk by the second processing unit, and physical address bits that define a physical address of the memory chunk.
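One possible layout for such a page entry is sketched below, assuming (purely for illustration) 64-bit entries, 4 KiB pages, and two attribute bits per processing unit; the field positions and names are hypothetical rather than taken from the disclosure.

```c
/* Hedged sketch: a single page-table entry carrying per-processor attribute
 * bits plus the physical page frame, shared by both processing units. */
#include <stdint.h>
#include <stdio.h>

#define PTE_CPU_READ   (1ull << 0)   /* first processing unit may read   */
#define PTE_CPU_WRITE  (1ull << 1)   /* first processing unit may write  */
#define PTE_GPU_READ   (1ull << 2)   /* second processing unit may read  */
#define PTE_GPU_WRITE  (1ull << 3)   /* second processing unit may write */
#define PTE_PFN_SHIFT  12            /* remaining bits hold the page frame number */

/* Build a common page-table entry usable by both processing units. */
static uint64_t make_pte(uint64_t phys_addr, uint64_t cpu_attr, uint64_t gpu_attr) {
    return (phys_addr & ~0xfffull) | cpu_attr | gpu_attr;
}

int main(void) {
    /* Same physical page: CPU gets read/write access, GPU read-only. */
    uint64_t pte = make_pte(0x0004c000ull, PTE_CPU_READ | PTE_CPU_WRITE, PTE_GPU_READ);
    printf("pfn=0x%llx gpu_write=%d\n",
           (unsigned long long)(pte >> PTE_PFN_SHIFT),
           (int)((pte & PTE_GPU_WRITE) != 0));
    return 0;
}
```

Keeping both sets of attribute bits in one shared entry lets each processing unit enforce its own access rights while resolving the same virtual-to-physical mapping.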