Abstract:
This disclosure describes techniques for performing hierarchical z-culling in a graphics processing system. In some examples, the techniques for performing hierarchical z-culling may involve selectively merging partially-covered source tiles for a tile location into a fully-covered merged source tile based on whether conservative farthest z-values for the partially-covered source tiles are nearer than a culling z-value for the tile location, and using a conservative farthest z-value associated with the fully-covered merged source tile to update the culling z-value for the tile location. In further examples, the techniques for performing hierarchical z-culling may use a cache unit that is not associated with an underlying memory to store conservative farthest z-values and coverage masks for merged source tiles. The capacity of the cache unit may be smaller than the size of cache needed to store merged source tile data for all of the tile locations in a render target.
Abstract:
Methods, systems, and devices for graphics processing are described. A device may receive an image including a set of pixels. The device may render a first subset of pixels in each bin of a set of bins during a first rendering pass, and defer rendering a second subset of pixels and a third subset of pixels in each bin of the set of bins during the first rendering pass. The second subset of pixels may include edge pixels and the third subset of pixels may be between the first subset of pixels and the second subset of pixels. The device may render the second subset of pixels and the third subset of pixels in each bin of the set of bins during a second rendering pass based on rendering the first subset of pixels. The device may then output the image based on the first and second rendering pass.
Abstract:
A render output unit running on at least one processor may receive a source pixel value to be written to a pixel location in a render target, wherein the source pixel value is associated with a source node in a hierarchical structure. The render output unit may receive a destination pixel value of the pixel location in the render target, wherein the destination pixel value is associated with a destination node in the hierarchical structure. The render output unit may determine a lowest common ancestor node of the source node and the destination node in the hierarchical structure. The render output unit may output a resulting pixel value associated with the lowest common ancestor node of the source node and the destination node to the pixel location in the render target.
Abstract:
A graphics processing unit (GPU) may determine a workload of a fragment shader program that executes on the GPU. The GPU may compare the workload of the fragment shader program to a threshold. In response to determining that the workload of the fragment shader program is lower than a specified threshold, the fragment shader program may process one or more fragments without the GPU performing early depth testing of the one or more fragments before the processing by the fragment shader program. The GPU may perform, after processing by the fragment shader program, late depth testing of the one or more fragments to result in one or more non-occluded fragments. The GPU may write pixel values for the one or more non-occluded fragments into a frame buffer.
Abstract:
This disclosure describes a method for performing conservative rasterization in a processor comprising determining vertices of a primitive, defining edges of the primitive by determining a set of edge equations based on the determined vertices, wherein the edge equations are based on an edge shifting parameter plus an offset, determining pixels that touch the edges of the primitive using the determined edge equations, and rasterizing the primitive using the determined pixels.