Abstract:
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for improving visibility generation in tile-based GPU architectures. A graphics processor may perform a first binning pass associated with visibility information for each of a plurality of primitives in at least one frame. The visibility information for each of the plurality of primitives may correspond to a visible indication or an invisible indication. The graphics processor may update a depth buffer based on the visibility information for all of the plurality of primitives in the at least one frame. The graphics processor may perform a second binning pass for each primitive in a visible set of the primitives based on the updated depth buffer. The graphics processor may store at least one of updated visibility information or updated position data for all primitives in the visible set of primitives from the second binning pass.
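The two-pass flow described above can be sketched roughly as follows. This is a toy model, not the disclosed implementation: it assumes each primitive fully covers exactly one tile at a constant depth, that smaller depth means nearer, and that the first pass marks a primitive visible simply when its tile is on screen. The names `Prim` and `two_pass_visibility` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prim:
    pid: int
    tile: int     # assumption: the single tile this primitive fully covers
    depth: float  # assumption: constant depth, smaller is nearer


def two_pass_visibility(prims, num_tiles):
    # First binning pass: one visible/invisible indication per primitive.
    # Assumption: "visible" here only means the tile index is on screen.
    first = {p.pid: 0 <= p.tile < num_tiles for p in prims}

    # Update the (per-tile) depth buffer using every primitive in the frame
    # that the first pass marked visible.
    depth_buffer = [float("inf")] * num_tiles
    for p in prims:
        if first[p.pid]:
            depth_buffer[p.tile] = min(depth_buffer[p.tile], p.depth)

    # Second binning pass: re-test only the visible set against the
    # updated depth buffer and store the updated visibility.
    second = {
        p.pid: p.depth <= depth_buffer[p.tile]
        for p in prims
        if first[p.pid]
    }
    return first, depth_buffer, second


prims = [Prim(0, 0, 0.5), Prim(1, 0, 0.2), Prim(2, 1, 0.9), Prim(3, 5, 0.1)]
first, depth_buffer, second = two_pass_visibility(prims, num_tiles=2)
```

In this example primitive 0 passes the first pass but is marked invisible in the second, because primitive 1 in the same tile is nearer.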
Abstract:
The present disclosure relates to methods and apparatus for graphics processing, e.g., a GPU. The apparatus may receive an image including a plurality of pixels associated with one or more workgroups and one or more pixel tiles, each of the workgroups and the pixel tiles including one or more pixels of the plurality of pixels. The apparatus may determine whether the one or more workgroups are misaligned with the one or more pixel tiles. The apparatus may determine a conversion order of the one or more workgroups when the one or more workgroups are misaligned with the one or more pixel tiles, the conversion order corresponding to a common multiple of one of the one or more workgroups and one of the one or more pixel tiles. The apparatus may convert each of the one or more workgroups based on the conversion order of the one or more workgroups.
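One possible reading of the common-multiple idea, reduced to one dimension, is sketched below. It assumes that "misaligned" means the workgroup width is not a whole multiple of the tile width, and that workgroups are then converted in groups spanning the least common multiple of the two widths so that each group covers whole tiles. The function names and the 1D simplification are assumptions, not taken from the disclosure.

```python
from math import lcm  # available in Python 3.9+


def is_misaligned(wg_width, tile_width):
    # Assumption: workgroups are aligned when each one covers whole tiles.
    return wg_width % tile_width != 0


def conversion_groups(num_workgroups, wg_width, tile_width):
    """Return workgroup indices grouped in the order they would be converted."""
    if not is_misaligned(wg_width, tile_width):
        # Aligned case: each workgroup can be converted on its own.
        return [[i] for i in range(num_workgroups)]
    # Misaligned case: group workgroups so each group spans a common
    # multiple of the workgroup width and the tile width.
    span = lcm(wg_width, tile_width)
    per_group = span // wg_width
    return [
        list(range(start, min(start + per_group, num_workgroups)))
        for start in range(0, num_workgroups, per_group)
    ]


groups = conversion_groups(num_workgroups=4, wg_width=6, tile_width=4)
```

With a workgroup width of 6 and a tile width of 4, the common multiple is 12 pixels, so workgroups are converted two at a time and each pair covers exactly three tiles.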
Abstract:
A method, an apparatus, and a computer-readable medium may be configured to perform a binning pass for a first frame. The apparatus may be configured to perform a rendering pass for the first frame in parallel with the binning pass. The apparatus may be configured to enhance efficiency in performing a binning pass and a rendering pass for tile-based rendering, such that the binning pass and rendering pass are performed concurrently. The apparatus may be configured to perform the binning pass using a first hardware pipeline, and may be configured to perform the rendering pass using a second hardware pipeline.
Abstract:
Some aspects of the disclosure include a self-refresh entry sequence for a memory, such as a DRAM, that may be used to avoid a frequency mismatch between a system processor and a system memory. The self-refresh entry sequence may signal the memory to reset the frequency set point state and default to the power-up state upon a self-refresh process exit. In another aspect, a new mode register may be used to indicate that the frequency set point needs to be reset after the next self-refresh entry command. In this aspect, the processor will execute a mode register write command followed by a self-refresh entry in response to the occurrence of a crash event. Then, the memory will reset to the default frequency set point by the end of self-refresh entry execution.
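The mode-register variant of the sequence can be illustrated with a small behavioral model. This is only a sketch: the class `DramModel`, its method names, and the value of `DEFAULT_FSP` are hypothetical and do not correspond to any real DRAM command encoding.

```python
DEFAULT_FSP = 0  # assumption: the power-up frequency set point


class DramModel:
    """Toy model of a memory with a frequency set point (FSP)."""

    def __init__(self):
        self.fsp = DEFAULT_FSP
        self.reset_fsp_on_sre = False  # flag held in the hypothetical mode register
        self.in_self_refresh = False

    def set_fsp(self, fsp):
        self.fsp = fsp

    def mode_register_write_reset_flag(self):
        # Indicates the FSP must be reset after the next self-refresh entry.
        self.reset_fsp_on_sre = True

    def self_refresh_entry(self):
        self.in_self_refresh = True
        if self.reset_fsp_on_sre:
            # Reset completes by the end of self-refresh entry execution.
            self.fsp = DEFAULT_FSP
            self.reset_fsp_on_sre = False

    def self_refresh_exit(self):
        self.in_self_refresh = False


def handle_crash(dram):
    # Processor side: mode register write followed by self-refresh entry.
    dram.mode_register_write_reset_flag()
    dram.self_refresh_entry()


dram = DramModel()
dram.set_fsp(2)
handle_crash(dram)
dram.self_refresh_exit()
```

After the crash sequence the model is back at the default set point, which is the state a rebooted processor would assume.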
Abstract:
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for backface culling for guard band clipping primitives. A graphics processor may identify at least one backface primitive in a set of primitives that extends beyond at least one guard band, where the at least one backface primitive is identified based on a set of fixed point coordinates. The graphics processor may cull the at least one backface primitive. The graphics processor may transmit an indication of the culled at least one backface primitive.
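A minimal sketch of the idea follows, assuming 8 fractional bits for the fixed-point format, counter-clockwise winding for front faces in a y-up coordinate system, and a symmetric square guard band. These choices and all names are illustrative assumptions. Python integers do not overflow, so the sketch ignores the bit-width concerns a hardware implementation would face.

```python
FRAC_BITS = 8  # assumption: fixed-point format with 8 fractional bits


def to_fixed(value):
    return int(round(value * (1 << FRAC_BITS)))


def signed_area2(tri):
    # Twice the signed area, computed entirely in integer arithmetic.
    (x0, y0), (x1, y1), (x2, y2) = tri
    return (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)


def cull_guard_band_backfaces(tris, guard_limit):
    """Return (kept, culled) index lists; culled is the indication to transmit."""
    limit = to_fixed(guard_limit)
    kept, culled = [], []
    for i, tri in enumerate(tris):
        fixed = [(to_fixed(x), to_fixed(y)) for x, y in tri]
        beyond_guard = any(abs(x) > limit or abs(y) > limit for x, y in fixed)
        # Assumption: negative signed area means clockwise, i.e. backfacing.
        if beyond_guard and signed_area2(fixed) < 0:
            culled.append(i)
        else:
            kept.append(i)
    return kept, culled


tris = [
    [(0, 0), (1, 0), (0, 1)],        # front-facing, inside the guard band
    [(0, 0), (0, 5000), (5000, 0)],  # backfacing, beyond the guard band
    [(0, 0), (5000, 0), (0, 5000)],  # front-facing, beyond the guard band
    [(0, 0), (0, 1), (1, 0)],        # backfacing, inside the guard band
]
kept, culled = cull_guard_band_backfaces(tris, guard_limit=4096)
```

The sketch culls only backfaces that cross the guard band; backfaces inside it are left for whatever culling the rest of the pipeline performs.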
Abstract:
Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may obtain an indication of a set of primitives for a draw call operation. The apparatus may also identify a subset of primitives in the set of primitives, each of the subset of primitives including a primitive portion that is outside of a viewing frustum for the draw call operation, and the primitive portion corresponding to less than all of each of the subset of primitives. Further, the apparatus may calculate an area of each of the subset of primitives including the primitive portion that is outside of the viewing frustum. The apparatus may also perform, or refrain from performing, a clipping operation for each of the subset of primitives based on the area of each of the subset of primitives being less than or greater than an area threshold.
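The area-based decision might look like the following 2D sketch. Several things here are assumptions: the frustum is reduced to the square [-1, 1] x [-1, 1], a primitive counts as partially outside when some but not all of its vertices are inside (which misses triangles that cross the square with every vertex outside), and clipping is performed for areas above the threshold and skipped below it. The abstract does not say which side of the threshold triggers clipping.

```python
def tri_area(tri):
    (x0, y0), (x1, y1), (x2, y2) = tri
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0


def inside_frustum(vertex):
    # Assumption: the viewing frustum is the 2D square [-1, 1] x [-1, 1].
    x, y = vertex
    return -1.0 <= x <= 1.0 and -1.0 <= y <= 1.0


def clip_decisions(tris, area_threshold):
    """Map index -> True (clip) or False (skip) for partially outside triangles."""
    decisions = {}
    for i, tri in enumerate(tris):
        inside_count = sum(inside_frustum(v) for v in tri)
        if 0 < inside_count < 3:  # partially outside (simplified test)
            # Assumption: large primitives are clipped, small ones are not.
            decisions[i] = tri_area(tri) > area_threshold
    return decisions


tris = [
    [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5)],  # fully inside, not in the subset
    [(0.0, 0.0), (1.2, 0.0), (0.0, 0.5)],  # partially outside, small area
    [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)],  # partially outside, large area
]
decisions = clip_decisions(tris, area_threshold=1.0)
```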
Abstract:
Aspects of the disclosure are directed to information processing. In accordance with one aspect, information processing includes a databus; a memory system coupled to the databus; and a graphics processing unit (GPU) coupled to the memory system and the databus, wherein the GPU is configured to do the following: retrieve a first plurality of atomic operations containing a first plurality of data values for a shared memory location; compute a first aggregate data value based on the first plurality of data values; and generate a first aggregate atomic operation containing the first aggregate data value.
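As a rough software illustration of the aggregation step, the sketch below assumes the atomic operations are all integer additions and represents each one as an `(address, value)` pair. Both the representation and the function names are assumptions. Operations that target the same shared location are folded into a single aggregate operation.

```python
from collections import defaultdict


def aggregate_atomic_adds(ops):
    """Combine atomic adds per address into one aggregate add each."""
    totals = defaultdict(int)
    for address, value in ops:
        totals[address] += value  # compute the aggregate data value
    # One aggregate atomic operation per shared memory location.
    return [(address, total) for address, total in totals.items()]


def apply_adds(memory, ops):
    for address, value in ops:
        memory[address] = memory.get(address, 0) + value
    return memory


ops = [(0x10, 1), (0x10, 2), (0x20, 5), (0x10, 3)]
aggregated = aggregate_atomic_adds(ops)
```

Applying the two aggregated operations leaves memory in the same state as applying the four original ones, while touching each shared location only once.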
Abstract:
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for truncation error signaling and adaptive dither for lossy bandwidth compression. A processor may perform a truncation process for data, where the data is associated with display processing, image processing, or data processing, and where the truncation process for the data results in truncated data. The processor may compute a set of truncation error values associated with the truncation process for the truncated data. The processor may generate a set of residual samples for the truncated data. The processor may generate a bitstream based on the set of residual samples for the truncated data and the set of truncation error values associated with the truncation process.
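The truncation, error, and residual steps can be illustrated with a toy encoder. The assumptions are significant: truncation drops the `k` low bits, residuals come from a simple previous-sample predictor, the "bitstream" is just a pair of Python lists, and the error values are the full dropped bits. Signaling the full dropped bits makes this toy round trip exact; a lossy codec would presumably signal a coarser summary of the error.

```python
def encode(samples, k):
    """Return (residuals, errors) for non-negative integer samples."""
    mask = (1 << k) - 1
    truncated = [s >> k for s in samples]  # truncation process
    errors = [s & mask for s in samples]   # truncation error values
    # Residual samples: difference from the previous truncated sample.
    residuals = [truncated[0]] + [
        truncated[i] - truncated[i - 1] for i in range(1, len(truncated))
    ]
    return residuals, errors


def decode(residuals, errors, k):
    truncated, previous = [], 0
    for r in residuals:
        previous += r
        truncated.append(previous)
    # Recombine each truncated sample with its signaled error value.
    return [(t << k) | e for t, e in zip(truncated, errors)]


samples = [200, 203, 197, 210]
residuals, errors = encode(samples, k=2)
restored = decode(residuals, errors, k=2)
```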
Abstract:
Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a plurality of primitives associated with one or more frames in a scene, a portion of the scene being associated with an upscaled sample space and/or a downscaled sample space. The apparatus may also perform a binning pass for the plurality of primitives, the binning pass being associated with an unscaled sample space, where the binning pass sorts each of the primitives into one or more bins associated with each of the one or more frames. Further, the apparatus may perform one of one or more rendering passes for each of the one or more bins. The apparatus may also rasterize each of the plurality of primitives based on at least one of the upscaled sample space or the downscaled sample space.
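The sketch below illustrates the split between binning in an unscaled space and rasterizing in a scaled one. It assumes integer vertex coordinates, square bins, bounding-box binning, and a single uniform scale factor applied at rasterization time. All names are hypothetical, and actual rasterization is left out.

```python
def bin_primitives(prims, bin_size, cols, rows):
    """Sort primitives into bins using their unscaled bounding boxes."""
    bins = {(c, r): [] for r in range(rows) for c in range(cols)}
    for pid, verts in enumerate(prims):
        xs = [x for x, _ in verts]
        ys = [y for _, y in verts]
        # Clamp the bounding box to the bin grid.
        c0 = max(min(xs) // bin_size, 0)
        c1 = min(max(xs) // bin_size, cols - 1)
        r0 = max(min(ys) // bin_size, 0)
        r1 = min(max(ys) // bin_size, rows - 1)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                bins[(c, r)].append(pid)
    return bins


def scale_for_raster(verts, scale):
    # Assumption: upscaling or downscaling is a uniform multiply applied
    # only when a bin's primitives are rasterized, not during binning.
    return [(x * scale, y * scale) for x, y in verts]


prims = [
    [(10, 10), (40, 10), (10, 20)],
    [(40, 40), (60, 40), (40, 60)],
]
bins = bin_primitives(prims, bin_size=32, cols=2, rows=2)
upscaled = scale_for_raster(prims[0], scale=2)
```

Keeping the binning pass in unscaled coordinates means the bin assignment does not change when the scale factor does.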