Runtime mechanism to optimize shader execution flow

    公开(公告)号:US12229864B2

    公开(公告)日:2025-02-18

    申请号:US17817815

    申请日:2022-08-05

    Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for runtime optimization of the shader execution flow. A graphics processor may obtain instruction execution data associated with a graphics workload, the instruction execution data including graphics data for a set of shader operations. The graphics processor may configure, at a first iteration, at least one predication value based on the instruction execution data including the graphics data for the set of shader operations. The graphics processor may adjust, at a second iteration, an execution flow of the graphics workload based on the configured at least one predication value, the execution flow of the graphics workload including the set of shader operations. The graphics processor may execute or refrain from executing, at the second iteration, each of the set of shader operations based on the adjusted execution flow of the graphics workload.

    Run-time mechanism for optimal shader

    公开(公告)号:US12067666B2

    公开(公告)日:2024-08-20

    申请号:US17664033

    申请日:2022-05-18

    CPC classification number: G06T15/005 G06T1/60

    Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a set of draw call instructions corresponding to a graphics workload, where the set of draw call instructions is associated with at least one run-time parameter. The apparatus may also obtain a first shader program associated with storing data in a system memory and at least one second shader program associated with storing data in a constant memory. Further, the apparatus may execute the first shader program or the at least one second shader program based on whether the at least one run-time parameter is less than or equal to a size of the constant memory. The apparatus may also update or maintain a configuration of a shader processor or a streaming processor based on executing the first shader program or the at least one second shader program.

    Compatible compression for different types of image views

    公开(公告)号:US12008677B2

    公开(公告)日:2024-06-11

    申请号:US17655358

    申请日:2022-03-17

    CPC classification number: G06T1/20 H04N19/182

    Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for compatible compression for different types of image views. A graphics processor may select a first common format of a plurality of common formats for at least one image based on at least one of application data or first metadata associated with the at least one image. The graphics processor may encode the at least one image based on the selected first common format for the at least one image. The graphics processor may select a second common format for the at least one image based on second metadata of the at least one image. The second common format may be identical to the first common format. The graphics processor may decode the at least one image based on the selected second common format for the at least one image.

    Foveated binned rendering associated with sample spaces

    公开(公告)号:US11734787B2

    公开(公告)日:2023-08-22

    申请号:US17478694

    申请日:2021-09-17

    CPC classification number: G06T1/20 A63F13/525 G06T3/40 G06T11/40

    Abstract: Aspects presented herein relate to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a plurality of primitives associated with one or more frames in a scene, a portion of the scene being associated with an upscaled sample space and/or a downscaled sample space. The apparatus may also perform a binning pass for the plurality of primitives, the binning pass being associated with an unscaled sample space, where the binning pass sorts each of the primitives into one or more bins associated with each of the one or more frames. Further, the apparatus may perform one of one or more rendering passes for each of the one or more bins. The apparatus may also rasterize each of the plurality of primitives based on at least one of the upscaled sample space or the downscaled sample space.

    Methods and apparatus for intra-wave texture looping

    公开(公告)号:US11640647B2

    公开(公告)日:2023-05-02

    申请号:US17191439

    申请日:2021-03-03

    Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may determine whether to divide a group of threads into a plurality of sub-groups of threads, each thread of the group of threads being associated with a shader program. The apparatus may also divide, upon determining to divide the group of threads into the plurality of sub-groups of threads, the group of threads into the plurality of sub-groups of threads. Additionally, the apparatus may execute, upon dividing the group of threads into the plurality of sub-groups of threads, a subsection of the shader program for each sub-group of threads of the plurality of sub-groups of threads.

    Methods and apparatus for tensor object support in machine learning workloads

    公开(公告)号:US11481865B2

    公开(公告)日:2022-10-25

    申请号:US17173643

    申请日:2021-02-11

    Abstract: The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may modify at least one texture memory object to support a data structure for one or more tensor objects. The apparatus may also determine one or more supported memory layouts for the one or more tensor objects based on the modified at least one texture memory object. Additionally, the apparatus may access data associated with the one or more tensor objects based on the one or more supported memory layouts, the data for each of the one or more tensor objects corresponding to at least one data instruction. The apparatus may also execute the at least one data instruction based on the accessed data associated with the one or more tensor objects.

    Methods and apparatus to facilitate adaptive texture filtering

    公开(公告)号:US11257277B2

    公开(公告)日:2022-02-22

    申请号:US16912479

    申请日:2020-06-25

    Abstract: The present disclosure relates to methods and apparatus for graphics processing. In some aspects, the apparatus selects a first mip-map layer with a first texture size and a second mip-map layer with a second texture size based on a third texture size of an image. The apparatus also determines a relative distance associated with the texture sizes. Additionally, the apparatus determines a first quantity of samples to select from the first mip-map layer, and determines a second quantity of samples to select from the second mip-map layer, the second quantity of samples being less than the first quantity of samples, and a second quantity of filter taps being less than a first quantity of filter taps. Also, the apparatus generates the image at the third texture size through filtering based on the first quantity of samples and the second quantity of samples.

Patent Agency Ranking