GRAPHICS PROCESSOR WITH ARITHMETIC AND ELEMENTARY FUNCTION UNITS
    11.
    Patent application
    Status: under examination (published)

    Publication number: US20150022534A1

    Publication date: 2015-01-22

    Application number: US14506959

    Filing date: 2014-10-06

    CPC classification number: G06T1/20 G06F9/30167 G06F9/383 G06F9/3851 G06F9/3885

    Abstract: A graphics processor capable of efficiently performing arithmetic operations and computing elementary functions is described. The graphics processor has at least one arithmetic logic unit (ALU) that can perform arithmetic operations and at least one elementary function unit that can compute elementary functions. The ALU(s) and elementary function unit(s) may be arranged such that they can operate in parallel to improve throughput. The graphics processor may also include fewer elementary function units than ALUs, e.g., four ALUs and a single elementary function unit. The four ALUs may perform an arithmetic operation on (1) four components of an attribute for one pixel or (2) one component of an attribute for four pixels. The single elementary function unit may operate on one component of one pixel at a time. The use of a single elementary function unit may reduce cost while still providing good performance.
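
The two ALU operating modes and the serial elementary function unit named in this abstract can be illustrated with a minimal sketch. It is a software model only, not the patented hardware: the function names, the list-based pixel layout, and the step counter are assumptions made for illustration.

```python
import math

NUM_ALUS = 4  # the abstract's example: four ALUs and one elementary function unit


def alu_across_components(pixel_attr, op):
    """Mode (1): each of the four ALUs takes one component of a single pixel."""
    assert len(pixel_attr) == NUM_ALUS
    return [op(c) for c in pixel_attr]


def alu_across_pixels(pixels, component, op):
    """Mode (2): each of the four ALUs takes the same component of a different pixel."""
    assert len(pixels) == NUM_ALUS
    return [op(p[component]) for p in pixels]


def efu_serial(values, func):
    """Single elementary function unit: one component of one pixel per step."""
    results = []
    steps = 0
    for v in values:
        results.append(func(v))
        steps += 1
    return results, steps


# Hypothetical attribute with four components (for example R, G, B, A).
pixel = [0.25, 0.5, 0.75, 1.0]
doubled = alu_across_components(pixel, lambda x: 2 * x)  # one pass over four ALUs
roots, steps = efu_serial(pixel, math.sqrt)              # four steps on one unit
```

In this model the arithmetic operation finishes in one pass while the elementary function takes four steps, which is the cost and throughput trade-off the abstract describes.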

    General purpose register allocation in streaming processor

    Publication number: US10558460B2

    Publication date: 2020-02-11

    Application number: US15379195

    Filing date: 2016-12-14

    Abstract: Systems and techniques are disclosed for dynamic allocation of general purpose registers based on latency associated with instructions in processor threads. A streaming processor can include a plurality of general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocation being based on execution latencies of instructions included in the threads.
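
The split of a register file into persistent and volatile registers can be sketched as follows. The abstract does not give the layout of the allocation information or the latency rule, so `AllocationInfo`, the per-thread pGPR ranges, the shared vGPR pool, and the threshold policy in `classify_value` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AllocationInfo:
    """Hypothetical form of the allocation information the scheduler receives."""
    persistent_per_thread: int  # pGPRs reserved for each thread
    volatile_pool: int          # vGPRs shared between threads (assumed)


def partition_registers(total_gprs, num_threads, info):
    """Split register indices into per-thread pGPR ranges and one vGPR pool."""
    needed = num_threads * info.persistent_per_thread + info.volatile_pool
    if needed > total_gprs:
        raise ValueError("allocation information exceeds the register file size")
    pgprs = {}
    next_free = 0
    for t in range(num_threads):
        end = next_free + info.persistent_per_thread
        pgprs[t] = list(range(next_free, end))
        next_free = end
    vgprs = list(range(next_free, next_free + info.volatile_pool))
    return pgprs, vgprs


def classify_value(latency_cycles, threshold):
    """Assumed policy: values that must outlive a long-latency instruction
    are kept in pGPRs; short-lived values use vGPRs."""
    return "pGPR" if latency_cycles >= threshold else "vGPR"
```

A compiler or driver could produce an `AllocationInfo` record from the instruction latencies in a shader and pass it to the scheduler, which then calls `partition_registers` once per launch. This division of work is a design guess, not something the abstract states.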

    Cache memory system and method using dynamically allocated dirty mask space
    15.
    Granted patent
    Status: in force

    Publication number: US09342461B2

    Publication date: 2016-05-17

    Application number: US13687761

    Filing date: 2012-11-28

    Abstract: A cache memory system includes a cache memory including a plurality of cache memory lines and a dirty buffer including a plurality of dirty masks. A cache controller is configured to allocate one of the dirty masks to each of the cache memory lines when a write to the respective cache memory line is not a full write to that cache memory line. Each of the dirty masks indicates dirty states of data units in one of the cache memory lines. The cache controller may include a dirty buffer index which stores identification (ID) information that associates the dirty masks with the cache memory lines to which the dirty masks are allocated. A cache line may include a fully dirty flag indicating when each byte in that cache line is dirty, so that a dirty mask does not need to be allocated for that cache line.
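
The allocation rule in this abstract (a dirty mask only for partial writes, a fully dirty flag otherwise) can be modeled with a minimal sketch. The class and method names are hypothetical, data units are assumed to be bytes, and promoting a line to fully dirty once its mask fills up is an added assumption that the abstract does not state.

```python
class DirtyMaskCache:
    """Toy model: tracks dirty state only, not the cached data itself."""

    def __init__(self, num_lines, line_size, num_masks):
        self.line_size = line_size
        self.fully_dirty = [False] * num_lines  # per-line fully dirty flag
        # Dirty buffer: a small pool of per-byte masks shared by all lines.
        self.masks = [[False] * line_size for _ in range(num_masks)]
        self.free_masks = list(range(num_masks))
        self.index = {}  # dirty buffer index: line number -> mask ID

    def _release(self, line):
        mask_id = self.index.pop(line, None)
        if mask_id is not None:
            self.free_masks.append(mask_id)

    def write(self, line, offset, length):
        if offset < 0 or length <= 0 or offset + length > self.line_size:
            raise ValueError("write outside the cache line")
        if offset == 0 and length == self.line_size:
            # Full write: the flag is enough, so no mask is needed.
            self._release(line)
            self.fully_dirty[line] = True
            return
        if self.fully_dirty[line]:
            return  # every byte is already dirty
        if line not in self.index:
            if not self.free_masks:
                raise RuntimeError("no free dirty mask (eviction not modeled)")
            mask_id = self.free_masks.pop()
            self.masks[mask_id] = [False] * self.line_size
            self.index[line] = mask_id
        mask = self.masks[self.index[line]]
        for i in range(offset, offset + length):
            mask[i] = True
        if all(mask):
            # Assumption: a completely filled mask is returned to the pool.
            self._release(line)
            self.fully_dirty[line] = True

    def dirty_bytes(self, line):
        if self.fully_dirty[line]:
            return list(range(self.line_size))
        if line in self.index:
            mask = self.masks[self.index[line]]
            return [i for i, dirty in enumerate(mask) if dirty]
        return []
```

Because masks come from a shared pool, the dirty buffer can hold fewer masks than there are cache lines, which is where the storage saving in this model comes from.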

    SKIPPING OF DATA STORAGE
    16.
    Patent application
    Status: in force

    Publication number: US20160054998A1

    Publication date: 2016-02-25

    Application number: US14462932

    Filing date: 2014-08-19

    Abstract: Techniques are described in which an indication is included to indicate that an intermediate value, generated as part of determining a final value, is at its last use and is not to be stored in a general purpose register (GPR). A processing unit avoids storing the intermediate value in the GPR based on the indication because the intermediate value is no longer needed for determining the final value.
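
The effect of such an indication can be sketched with a tiny interpreter. The instruction format, the `skip_store` field, and the `FORWARD` operand name are hypothetical. The sketch assumes the intermediate value is consumed by the very next instruction, which is one simple way to model a last use.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Tuple

FORWARD = "fwd"  # hypothetical operand name: the previous instruction's result


@dataclass
class Instr:
    op: Callable[[float, float], float]
    srcs: Tuple[str, str]
    dest: str
    skip_store: bool = False  # the indication: do not write the result to a GPR


def run(program, gpr):
    """Execute the program and return the number of GPR writes performed."""
    forwarded = None
    writes = 0
    for ins in program:
        a, b = (forwarded if s == FORWARD else gpr[s] for s in ins.srcs)
        result = ins.op(a, b)
        forwarded = result  # stays available to the next instruction only
        if not ins.skip_store:
            gpr[ins.dest] = result
            writes += 1
    return writes


# final = (r0 * r1) + r2; the product is an intermediate value used once.
program = [
    Instr(operator.mul, ("r0", "r1"), "r3", skip_store=True),
    Instr(operator.add, (FORWARD, "r2"), "r4"),
]
gpr = {"r0": 2.0, "r1": 3.0, "r2": 4.0}
writes = run(program, gpr)  # writes == 1 and gpr["r4"] == 10.0; r3 is never written
```

Without the flag the same program would perform two GPR writes, so the saving in this model is one register write per flagged intermediate value.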

    MEMORY MANAGEMENT USING DYNAMICALLY ALLOCATED DIRTY MASK SPACE
    17.
    Patent application
    Status: in force

    Publication number: US20140149685A1

    Publication date: 2014-05-29

    Application number: US13687761

    Filing date: 2012-11-28

    Abstract: Systems and methods related to a memory system including a cache memory are disclosed. The cache memory system includes a cache memory including a plurality of cache memory lines and a dirty buffer including a plurality of dirty masks. A cache controller is configured to allocate one of the dirty masks to each of the cache memory lines when a write to the respective cache memory line is not a full write to that cache memory line. Each of the dirty masks indicates dirty states of data units in one of the cache memory lines. The cache controller stores identification (ID) information that associates the dirty masks with the cache memory lines to which the dirty masks are allocated.

    Dynamic wave pairing
    19.
    Granted patent

    Publication number: US11954758B2

    Publication date: 2024-04-09

    Application number: US17652478

    Filing date: 2022-02-24

    CPC classification number: G06T1/20 G06F9/505

    Abstract: This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for dynamic wave pairing. A graphics processor may allocate one or more GPU workloads to one or more wave slots of a plurality of wave slots. The graphics processor may select a first execution slot of a plurality of execution slots for executing the one or more GPU workloads. The selection may be based on one of a plurality of granularities. The graphics processor may execute, at the selected first execution slot, the one or more GPU workloads at the one of the plurality of granularities.
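
The sequence in this abstract (allocate workloads to wave slots, select an execution slot, execute at a chosen granularity) can be sketched loosely. The abstract does not define the granularities or the selection policy, so everything specific here is an assumption: granularity is taken to mean the number of waves issued together (2 for a pair, 1 for a single wave), the coarsest feasible granularity is preferred, and the first idle execution slot is selected.

```python
def allocate(workloads, num_wave_slots):
    """Place workloads into wave slots in order; empty slots hold None."""
    if len(workloads) > num_wave_slots:
        raise ValueError("more workloads than wave slots")
    return list(workloads) + [None] * (num_wave_slots - len(workloads))


def choose_granularity(wave_slots, granularities=(2, 1)):
    """Assumed policy: take the coarsest granularity the ready waves allow."""
    ready = sum(w is not None for w in wave_slots)
    for g in granularities:
        if ready >= g:
            return g
    return None


def issue(wave_slots, exec_busy):
    """Select the first idle execution slot and issue waves to it.

    Returns (execution slot, granularity, issued workloads), or None when
    nothing can be issued.
    """
    g = choose_granularity(wave_slots)
    if g is None or False not in exec_busy:
        return None
    slot = exec_busy.index(False)
    picked = []
    for i, w in enumerate(wave_slots):
        if w is not None and len(picked) < g:
            picked.append(w)
            wave_slots[i] = None  # the wave slot is free again
    exec_busy[slot] = True
    return slot, g, picked
```

With three workloads in four wave slots and one idle execution slot, this model issues the first two workloads as a pair and later issues the third alone once an execution slot frees up.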
