Resource Synchronization for Graphics Processing

    Publication Number: US20200167986A1

    Publication Date: 2020-05-28

    Application Number: US16707455

    Filing Date: 2019-12-09

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to synchronizing access to pixel resources. Examples of pixel resources include color attachments, a stencil buffer, and a depth buffer. In some embodiments, hardware registers are used to track the status of assigned pixel resources, and pixel wait and pixel release instructions are used to synchronize access to the pixel resources. In some embodiments, other accesses to the pixel resources may occur out of program order. Relative to tracking and ordering pass groups, this weak ordering and explicit synchronization may improve performance and reduce power consumption. Disclosed techniques may also facilitate coordination between fragment rendering threads and auxiliary mid-render compute tasks.
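    The abstract describes per-resource wait/release synchronization instead of ordering whole pass groups. The sketch below is a minimal software illustration of that bookkeeping; the tracker type, the resource names, and the return-value model are assumptions for illustration, not the patented hardware design.

```swift
// Minimal sketch: earlier fragments release a pixel resource when done, and a
// later fragment waits only on the resources it actually touches. All names
// are hypothetical; the patent tracks this state in hardware registers.
final class PixelResourceTracker {
    private var released: Set<String> = []   // resources released by earlier fragments

    // Stands in for the "pixel release" instruction in the abstract.
    func pixelRelease(_ resource: String) {
        released.insert(resource)
    }

    // Stands in for the "pixel wait" instruction: returns true once the
    // resource is safe to access (a real shader would stall instead).
    func pixelWait(_ resource: String) -> Bool {
        released.contains(resource)
    }
}

let tracker = PixelResourceTracker()
tracker.pixelRelease("colorAttachment0")
print(tracker.pixelWait("colorAttachment0"))  // true
print(tracker.pixelWait("depthBuffer"))       // false: still owned by an earlier fragment
```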

    Accelerated Blits of Multisampled Textures on GPUs

    Publication Number: US20170358109A1

    Publication Date: 2017-12-14

    Application Number: US15179738

    Filing Date: 2016-06-10

    Applicant: Apple Inc.

    CPC classification number: G06T1/60 G06T15/005

    Abstract: Systems, computer readable media, and methods for hardware accelerated blits of multisampled textures on graphics processing units (GPUs) are disclosed. For multisampled surfaces, texture-to-buffer blits cannot be trivially implemented because most GPUs do not support writing multisampled surfaces with a linear memory layout. Moreover, GPUs often have a maximum limit for row stride (i.e., the number of bytes from one row of pixels in memory to the next) and/or texture size. When the destination buffer for the blit of a multisampled texture is too large to be aliased by an equivalent non-multisampled texture view, the stride of the view has no spatial relationship with the destination buffer. Thus, to access the source texture correctly, a ‘remapping’ may be performed to determine the linear sample index of a fragment within the view, and the destination buffer stride may be used to compute the texture coordinates used to sample the source texture.
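    The key step in the abstract is the "remapping" from a fragment's position in the aliasing view to a linear sample index, which is then reinterpreted against the destination buffer's stride. The sketch below shows that index arithmetic under an assumed sample-major layout; the function name, the samples-as-units convention, and the layout are illustrative guesses rather than the patented mapping.

```swift
// Sketch of the remapping idea: the view's fragment position yields a linear
// sample index, and the destination buffer's stride (not the view's) is used
// to recover which source texel and sample that fragment represents.
// Layout assumption: all samples of one texel are stored consecutively.
func remapToSource(fragX: Int, fragY: Int,
                   viewWidth: Int,              // width of the aliasing view, in samples
                   destStrideSamples: Int,      // destination buffer row stride, in samples
                   sampleCount: Int) -> (srcX: Int, srcY: Int, sample: Int) {
    // 1. Linear sample index of this fragment within the view.
    let linear = fragY * viewWidth + fragX
    // 2. Re-derive the row and in-row position from the destination stride.
    let srcY = linear / destStrideSamples
    let inRow = linear % destStrideSamples
    // 3. Split the in-row position into a texel coordinate and a sample index.
    return (srcX: inRow / sampleCount, srcY: srcY, sample: inRow % sampleCount)
}

let coords = remapToSource(fragX: 5, fragY: 0,
                           viewWidth: 4096, destStrideSamples: 2048, sampleCount: 4)
print(coords)   // (srcX: 1, srcY: 0, sample: 1)
```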

    Execution graph acceleration
    Invention Grant

    Publication Number: US11436055B2

    Publication Date: 2022-09-06

    Application Number: US16688487

    Filing Date: 2019-11-19

    Applicant: Apple Inc.

    Abstract: A first command is fetched for execution on a GPU. Dependency information for the first command, which indicates a number of parent commands that the first command depends on, is determined. The first command is inserted into an execution graph based on the dependency information. The execution graph defines an order of execution for plural commands including the first command. The parent commands are configured to be executed on the GPU before the first command is executed. A wait count for the first command, which indicates the number of parent commands of the first command, is determined based on the execution graph. The first command is inserted into cache memory in response to determining that the wait count for the first command is zero or that each of the parent commands the first command depends on has already been inserted into the cache memory.
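    A small sketch of the wait-count bookkeeping the abstract describes follows. The Command type, the in-memory "cache", and the insertion rule are simplified stand-ins for the claimed mechanism.

```swift
// Sketch: each command's wait count is the number of its parent commands;
// a command moves into the cache once it has no parents or all of its
// parents have already been cached. Types are illustrative stand-ins.
struct Command {
    let id: Int
    let parents: [Int]   // ids of the commands this one depends on
}

final class ExecutionGraph {
    private(set) var waitCount: [Int: Int] = [:]
    private(set) var cached: Set<Int> = []

    func insert(_ cmd: Command) {
        waitCount[cmd.id] = cmd.parents.count
        if cmd.parents.isEmpty || cmd.parents.allSatisfy({ cached.contains($0) }) {
            cached.insert(cmd.id)
        }
    }
}

let graph = ExecutionGraph()
graph.insert(Command(id: 1, parents: []))    // no parents -> cached immediately
graph.insert(Command(id: 2, parents: [1]))   // parent 1 already cached -> cached
graph.insert(Command(id: 3, parents: [4]))   // parent 4 not seen yet -> waits
print(graph.cached)                          // contains 1 and 2 (Set order unspecified)
```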

    De-prioritization supporting frame buffer caching

    Publication Number: US11237967B2

    Publication Date: 2022-02-01

    Application Number: US16783766

    Filing Date: 2020-02-06

    Applicant: Apple Inc.

    Abstract: Systems, methods, and computer readable media to manage memory cache for graphics processing are described. A processor creates a resource group for a plurality of graphics application program interface (API) resources. The processor subsequently encodes a set command that references the resource group within a command buffer and assigns a data set identifier (DSID) to the resource group. The processor also encodes a write command within the command buffer that causes the graphics processor to write data within a cache line and mark the written cache line with the DSID, a read command that causes the graphics processor to read data written into the resource group, and a de-prioritize command that causes the graphics processor to notify the memory cache to later flush content from the cache line associated with the DSID and to later invalidate the cache line when higher priority content is received.
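    The abstract describes four commands encoded against a resource group's data set identifier (DSID). The enum below sketches that command sequence; the names and the encoding are hypothetical stand-ins, not the actual API.

```swift
// Hypothetical sketch of the command sequence in the abstract: a resource
// group is bound to a DSID, writes tag cache lines with that DSID, and a
// de-prioritize command lets the cache flush and invalidate those lines
// when higher-priority content arrives.
enum CacheCommand {
    case setResourceGroup(dsid: Int)   // assign a DSID to the resource group
    case write(dsid: Int)              // write data and mark the cache line with the DSID
    case read(dsid: Int)               // read data written into the resource group
    case deprioritize(dsid: Int)       // mark the DSID's lines as first to evict
}

// The order mirrors the abstract: set, write, read, then de-prioritize.
let commandBuffer: [CacheCommand] = [
    .setResourceGroup(dsid: 7),
    .write(dsid: 7),
    .read(dsid: 7),
    .deprioritize(dsid: 7),
]
print(commandBuffer.count)   // 4 commands encoded
```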

    Bindpoint Emulation
    Invention Application

    Publication Number: US20210097643A1

    Publication Date: 2021-04-01

    Application Number: US16786173

    Filing Date: 2020-02-10

    Applicant: Apple Inc.

    Abstract: A computer-implemented technique for accessing textures by a graphics processing unit (GPU) includes determining a first frequency with which a first texture is expected to be accessed by an application executing on a GPU, determining a second frequency with which a second texture is expected to be accessed by the application, determining to load memory address information associated with the first texture into a GPU register when the first frequency is greater than or equal to a threshold frequency value, determining to load memory address information associated with the second texture into a buffer memory when the second frequency is less than the threshold frequency value, receiving a draw call utilizing the textures, and rendering the draw call using the first texture by accessing the memory address information in the GPU register and the second texture by accessing the memory address information in the buffer memory.
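    The decision the abstract describes reduces to a threshold test: a frequently accessed texture keeps its address information in a GPU register, while an infrequently accessed one falls back to a buffer in memory. A sketch follows; the threshold value, units, and storage names are assumptions for illustration.

```swift
// Sketch of the threshold decision described in the abstract.
enum AddressStorage { case gpuRegister, bufferMemory }

func storage(forExpectedAccesses frequency: Int, threshold: Int = 100) -> AddressStorage {
    // Hot textures keep their address in a register; cold ones use buffer memory.
    frequency >= threshold ? .gpuRegister : .bufferMemory
}

let firstTexture  = storage(forExpectedAccesses: 500)  // .gpuRegister
let secondTexture = storage(forExpectedAccesses: 3)    // .bufferMemory
print(firstTexture, secondTexture)
```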

    Graphics system and method for use of sparse textures

    Publication Number: US10896525B2

    Publication Date: 2021-01-19

    Application Number: US16428403

    Filing Date: 2019-05-31

    Applicant: Apple Inc.

    Abstract: This disclosure includes example embodiments of graphics processor memory management systems that support the use of graphical textures that are not fully bound or “backed” in memory throughout their entire lifespans. Such graphical textures are referred to herein as “sparse textures.” According to some embodiments, sparse textures may be split into fixed-dimension pages in memory wherein, during execution, a user may indicate a desire to map certain pages to physical memory locations and populate such pages with the underlying data. In other embodiments, statistical information obtained from the graphics processor is used to aid in the determination of whether or not a given texture (or portion of a texture) needs physical memory backing. In yet other embodiments, the graphics processor may also enforce ordering guarantees, e.g., in instances when fewer pages are available in memory than are needed for backing at a given moment in time.
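    The sketch below illustrates the page-mapping idea from the abstract: a texture split into fixed-dimension pages, with physical backing attached only to the pages the user explicitly maps. The page size, types, and 4-bytes-per-texel format are illustrative assumptions.

```swift
// Sketch of a sparse texture split into fixed-dimension pages; a page has
// physical backing only after the user explicitly maps it.
struct PageID: Hashable { let x: Int; let y: Int }

final class SparseTexture {
    let pageSize = (width: 128, height: 128)       // fixed page dimensions (assumed)
    private var backing: [PageID: [UInt8]] = [:]   // physical memory per mapped page

    // Map a page to physical memory so it can be populated with data.
    func mapPage(_ id: PageID) {
        guard backing[id] == nil else { return }
        backing[id] = [UInt8](repeating: 0, count: pageSize.width * pageSize.height * 4)
    }

    func isBacked(_ id: PageID) -> Bool { backing[id] != nil }
}

let texture = SparseTexture()
texture.mapPage(PageID(x: 0, y: 0))            // only this page receives physical memory
print(texture.isBacked(PageID(x: 0, y: 0)))    // true
print(texture.isBacked(PageID(x: 7, y: 3)))    // false: never mapped, no memory used
```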

    Indirect command buffers for graphics processing

    Publication Number: US10789756B2

    Publication Date: 2020-09-29

    Application Number: US16390654

    Filing Date: 2019-04-22

    Applicant: Apple Inc.

    Abstract: Systems, methods, and computer readable media to encode and execute an indirect command buffer are described. A processor creates an indirect command buffer that is configured to be encoded into by a graphics processor at a later point in time. The processor encodes, within a command buffer, a produce command that references the indirect command buffer, where the produce command triggers execution on the graphics processor of a first operation that encodes a set of commands within the data structure. The processor also encodes, within the command buffer, a consume command that triggers execution on the graphics processor of a second operation that executes the set of commands encoded within the data structure. After encoding the command buffer, a processor commits the command buffer for execution on the graphics processor.
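    A compact sketch of the produce/consume pattern described above follows; the types, the string-encoded "commands", and the loop that stands in for GPU execution are illustrative assumptions, not the real encoder objects.

```swift
// Sketch of the pattern in the abstract: the CPU encodes a produce command and
// a consume command into a command buffer; at execution time, the produce pass
// fills the indirect command buffer and the consume pass runs what it finds there.
struct IndirectCommandBuffer { var commands: [String] = [] }

enum EncodedCommand {
    case produce(icb: Int)   // first operation: encode commands into ICB #icb
    case consume(icb: Int)   // second operation: execute the commands in ICB #icb
}

var icbs = [IndirectCommandBuffer()]             // created up front, filled later
let commandBuffer: [EncodedCommand] = [.produce(icb: 0), .consume(icb: 0)]

// Committing the command buffer; this loop stands in for execution on the GPU.
for cmd in commandBuffer {
    switch cmd {
    case .produce(let i): icbs[i].commands = ["draw(meshA)", "draw(meshB)"]
    case .consume(let i): icbs[i].commands.forEach { print("executing \($0)") }
    }
}
```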

    Task Execution on a Graphics Processor Using Indirect Argument Buffers

    Publication Number: US20200242726A1

    Publication Date: 2020-07-30

    Application Number: US16850101

    Filing Date: 2020-04-16

    Applicant: Apple Inc.

    Abstract: The disclosure pertains to techniques for operation of graphics systems and task execution on a graphics processor. One such technique comprises a computer-implemented method for task execution on a graphics processor, the method comprising creating a data structure for grouping data resources, populating the data structure with two or more data resources for encoding into a graphics processing language by an encoding object, passing the data structure to a first programming interface command, the first programming interface command configured to access the data structure's data resources, triggering execution of a first function on a graphics processor in response to passing the data structure to the first programming interface command, passing the data structure to a second programming interface command, the second programming interface command configured to access the data structure's data resources, and triggering execution of a second function on the graphics processor in response to passing the data structure to the second programming interface command.
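    The technique boils down to encoding a group of resources once and handing the same group to two different commands, each of which triggers a function on the graphics processor. A sketch with hypothetical stand-in types follows.

```swift
// Sketch: one resource group, built once, is passed to two programming
// interface commands, each of which triggers a function that reads the
// group's resources. All types and names here are illustrative.
struct ResourceGroup {                  // the "data structure for grouping data resources"
    var texture: String
    var sampler: String
    var constants: [Float]
}

// Stands in for a programming interface command that accesses the group and
// triggers a function on the graphics processor.
func dispatch(_ function: String, with group: ResourceGroup) {
    print("\(function): \(group.texture), \(group.sampler), \(group.constants.count) constants")
}

let group = ResourceGroup(texture: "albedo", sampler: "linearClamp", constants: [1, 0, 0, 1])
dispatch("firstFunction", with: group)    // first command consumes the group
dispatch("secondFunction", with: group)   // the same group feeds the second command
```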
