摘要:
This disclosure describes techniques for extending the architecture of a general purpose graphics processing unit (GPGPU) with parallel processing units to allow efficient processing of pipeline-based applications. The techniques include configuring local memory buffers connected to parallel processing units operating as stages of a processing pipeline to hold data for transfer between the parallel processing units. The local memory buffers allow on-chip, low-power, direct data transfer between the parallel processing units. The local memory buffers may include hardware-based data flow control mechanisms to enable transfer of data between the parallel processing units. In this way, data may be passed directly from one parallel processing unit to the next parallel processing unit in the processing pipeline via the local memory buffers, in effect transforming the parallel processing units into a series of pipeline stages.
摘要:
This disclosure describes techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values. The techniques may include retrieving a destination alpha value from a bin buffer, the destination alpha value being generated in response to processing a first pixel associated with a first primitive. The techniques may further include determining, based on the destination alpha value, whether to perform an action that causes one or more texture values for a second pixel to not be retrieved from a texture buffer. In some examples, the action may include discarding the second pixel from a pixel processing pipeline prior to the second pixel arriving at a texture mapping stage of the pixel processing pipeline. The second pixel may be associated with a second primitive different than the first primitive.
摘要:
This disclosure describes techniques for selectively activating a resume check operation in a single instruction, multiple data (SIMD) processing system. A processor is described that is configured to selectively enable or disable a resume check operation for a particular instruction based on information included in the instruction that indicates whether a resume check operation is to be performed for the instruction. A compiler is also described that is configured to generate compiled code which, when executed, causes a resume check operation to be selectively enabled or disabled for particular instructions. The compiled code may include one or more instructions that each specify whether a resume check operation is to be performed for the respective instruction. The techniques of this disclosure may be used to reduce the power consumption of and/or improve the performance of a SIMD system that utilizes a resume check operation to manage the reactivation of deactivated threads.
摘要:
This disclosure is directed to deferred preemption techniques for scheduling graphics processing unit (GPU) command streams for execution on a GPU. A host CPU is described that is configured to control a GPU to perform deferred-preemption scheduling. For example, a host CPU may select one or more locations in a GPU command stream as being one or more locations at which preemption is allowed to occur in response to receiving a preemption notification, and may place one or more tokens in the GPU command stream based on the selected one or more locations. The tokens may indicate to the GPU that preemption is allowed to occur at the selected one or more locations. This disclosure further describes a GPU configured to preempt execution of a GPU command stream based on one or more tokens placed in a GPU command stream.
摘要:
This disclosure describes techniques for extending the architecture of a general purpose graphics processing unit (GPGPU) with parallel processing units to allow efficient processing of pipeline-based applications. The techniques include configuring local memory buffers connected to parallel processing units operating as stages of a processing pipeline to hold data for transfer between the parallel processing units. The local memory buffers allow on-chip, low-power, direct data transfer between the parallel processing units. The local memory buffers may include hardware-based data flow control mechanisms to enable transfer of data between the parallel processing units. In this way, data may be passed directly from one parallel processing unit to the next parallel processing unit in the processing pipeline via the local memory buffers, in effect transforming the parallel processing units into a series of pipeline stages.
摘要:
A graphics processing system comprises at least one memory device storing a plurality of pixel command threads and a plurality of vertex command threads. An arbiter coupled to the at least one memory device is provided that selects a command thread from either the plurality of pixel or vertex command threads based on relative priorities of the plurality of pixel command threads and the plurality of vertex command threads. The selected command thread is provided to a command processing engine capable of processing pixel command threads and vertex command threads.
摘要:
A method and apparatus for dynamic issuing of memory access instructions. In particular, a specific data access request that is about to be sent to a memory, such as a frame buffer, is dynamically chosen based upon pending requests within a pipeline. It is possible to optimize video data requests by dynamically selecting a memory access request at the time the request is made to the memory. In particular, if it is recognized that the memory about to be accessed will no longer be needed by subsequent memory requests, the request can be changed from a normal access request to an access request with an auto-close option. By using an auto close option, the memory bank being accessed is closed after the access, without issuing a separate memory close instruction.
摘要:
A video graphics chip includes a graphics module configured to process incoming video information in accordance with different modes to produce a video output signal and to transmit the video output signal toward a display screen for rendering of video corresponding to the video information, and a display mode module coupled to the graphics module configured to analyze the incoming video information to determine a type of video associated with the incoming video information and to send a video mode indication of a preferred video processing mode for the incoming video information to the graphics module, where the graphics module is configured to process the incoming video information in accordance with a selected mode from the plurality of different modes based on the video mode indication received from the display module.
摘要:
This disclosure describes techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values. The techniques may include retrieving a destination alpha value from a bin buffer, the destination alpha value being generated in response to processing a first pixel associated with a first primitive. The techniques may further include determining, based on the destination alpha value, whether to perform an action that causes one or more texture values for a second pixel to not be retrieved from a texture buffer. In some examples, the action may include discarding the second pixel from a pixel processing pipeline prior to the second pixel arriving at a texture mapping stage of the pixel processing pipeline. The second pixel may be associated with a second primitive different than the first primitive.
摘要:
A graphics processing system comprises at least one memory device storing a plurality of pixel command threads and a plurality of vertex command threads. An arbiter coupled to the at least one memory device is provided that selects a command thread from either the plurality of pixel or vertex command threads based on relative priorities of the plurality of pixel command threads and the plurality of vertex command threads. The selected command thread is provided to a command processing engine capable of processing pixel command threads and vertex command threads.