Hardware-managed virtual buffers using a shared memory for load distribution
    1.
    Granted patent
    Hardware-managed virtual buffers using a shared memory for load distribution (In force)

    Publication number: US08760460B1

    Publication date: 2014-06-24

    Application number: US12773712

    Filing date: 2010-05-04

    IPC classification: G06T1/60

    CPC classification: G06T1/60

    Abstract: One embodiment of the present invention sets forth a technique for using a shared memory to store hardware-managed virtual buffers. A circular buffer is allocated within a general-purpose multi-use cache for storage of primitive attribute data rather than having a dedicated buffer for the storage of the primitive attribute data. The general-purpose multi-use cache is also configured to store other graphics data since the space requirement for primitive attribute data storage is highly variable, depending on the number of attributes and the size of primitives. Entries in the circular buffer are allocated as needed and released and invalidated after the primitive attribute data has been consumed. An address to the circular buffer entry is transmitted along with primitive descriptors from object-space processing to the distributed processing in screen-space.

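    The allocate-as-needed, release-after-consumption bookkeeping described in the abstract can be sketched as follows. This is a minimal illustration, not the patented hardware logic; all names (`CircularBuffer`, `alloc`, `release`) and the address arithmetic are assumptions for demonstration.

```python
class CircularBuffer:
    """Sketch of a virtual buffer carved out of a shared cache region."""

    def __init__(self, base_addr, num_entries, entry_size):
        self.base_addr = base_addr
        self.entry_size = entry_size
        self.num_entries = num_entries
        self.head = 0     # index of the next entry to allocate
        self.in_use = 0   # entries allocated but not yet released

    def alloc(self):
        """Allocate one entry on demand; return its address in the cache.
        The address travels with the primitive descriptors downstream."""
        if self.in_use == self.num_entries:
            raise RuntimeError("circular buffer full; producer must stall")
        addr = self.base_addr + (self.head % self.num_entries) * self.entry_size
        self.head += 1
        self.in_use += 1
        return addr

    def release(self):
        """Release (and conceptually invalidate) the oldest entry once its
        primitive attribute data has been consumed in screen space."""
        if self.in_use == 0:
            raise RuntimeError("nothing to release")
        self.in_use -= 1


buf = CircularBuffer(base_addr=0x1000, num_entries=4, entry_size=64)
addr = buf.alloc()   # address sent along with a primitive descriptor
# ... screen-space consumers read attribute data at `addr`, then:
buf.release()
```

    The point of the circular discipline is that storage is reclaimed in consumption order, so the same cache region can be shared with other graphics data instead of reserving a worst-case dedicated buffer.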

    Shader Program Headers
    2.
    Patent application
    Shader Program Headers (In force)

    Publication number: US20110084976A1

    Publication date: 2011-04-14

    Application number: US12899431

    Filing date: 2010-10-06

    IPC classification: G06T15/80

    CPC classification: G06T15/005

    Abstract: One embodiment of the present invention sets forth a technique for configuring a graphics processing pipeline (GPP) to process data according to one or more shader programs. The method includes receiving a plurality of pointers, where each pointer references a different shader program header (SPH) included in a plurality of SPHs, and each SPH is associated with a different shader program that executes within the GPP. For each SPH included in the plurality of SPHs, one or more GPP configuration parameters included in the SPH are identified, and the GPP is adjusted based on the one or more GPP configuration parameters.

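    The dereference-and-configure loop the abstract describes can be sketched as below. The SPH field names, the configuration parameters, and the pipeline-state dictionary are all hypothetical; the patent does not enumerate specific parameters.

```python
def configure_pipeline(pipeline_state, sph_pointers, headers):
    """For each SPH referenced by a pointer, identify its configuration
    parameters and fold them into the pipeline state (adjust the GPP)."""
    for ptr in sph_pointers:
        sph = headers[ptr]                   # dereference the SPH pointer
        for param, value in sph["config"].items():
            pipeline_state[param] = value    # adjust the GPP accordingly
    return pipeline_state

# Example: two headers, each associated with a different shader program.
headers = {
    0x10: {"stage": "vertex",   "config": {"output_attr_count": 8}},
    0x20: {"stage": "fragment", "config": {"input_attr_count": 8,
                                           "depth_write": True}},
}
state = configure_pipeline({}, [0x10, 0x20], headers)
```

    The design keeps per-program pipeline requirements next to the program itself, so the pipeline can be reconfigured by walking headers rather than by issuing a separate stream of state-setting commands.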

    Shader program headers
    3.
    Granted patent
    Shader program headers (In force)

    Publication number: US08786618B2

    Publication date: 2014-07-22

    Application number: US12899431

    Filing date: 2010-10-06

    IPC classification: G06T1/00

    CPC classification: G06T15/005

    Abstract: One embodiment of the present invention sets forth a technique for configuring a graphics processing pipeline (GPP) to process data according to one or more shader programs. The method includes receiving a plurality of pointers, where each pointer references a different shader program header (SPH) included in a plurality of SPHs, and each SPH is associated with a different shader program that executes within the GPP. For each SPH included in the plurality of SPHs, one or more GPP configuration parameters included in the SPH are identified, and the GPP is adjusted based on the one or more GPP configuration parameters.


    INTER-SHADER ATTRIBUTE BUFFER OPTIMIZATION
    5.
    Patent application
    INTER-SHADER ATTRIBUTE BUFFER OPTIMIZATION (In force)

    Publication number: US20110080415A1

    Publication date: 2011-04-07

    Application number: US12895579

    Filing date: 2010-09-30

    IPC classification: G06T1/20

    Abstract: One embodiment of the present invention sets forth a technique for reducing the amount of memory required to store vertex data processed within a processing pipeline that includes a plurality of shading engines. The method includes determining a first active shading engine and a second active shading engine included within the processing pipeline, wherein the second active shading engine receives vertex data output by the first active shading engine. An output map is received and indicates one or more attributes that are included in the vertex data and output by the first active shading engine. An input map is received and indicates one or more attributes that are included in the vertex data and received by the second active shading engine from the first active shading engine. Then, a buffer map is generated based on the input map, the output map, and a pre-defined set of rules that includes rule data associated with both the first shading engine and the second shading engine, wherein the buffer map indicates one or more attributes that are included in the vertex data and stored in a memory that is accessible by both the first active shading engine and the second active shading engine.

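    A minimal sketch of the buffer-map generation, assuming the buffer map is roughly the set of attributes the downstream shader reads that the upstream shader writes, adjusted by per-engine rules. The rule format (`"always_keep"`) and attribute names are illustrative assumptions, not taken from the patent.

```python
def build_buffer_map(output_map, input_map, rules):
    """Return the attributes that must be stored in the memory shared by
    the first and second active shading engines."""
    needed = set(output_map) & set(input_map)    # produced AND consumed
    needed |= set(rules.get("always_keep", ()))  # rule-mandated attributes
    return sorted(needed)

out_attrs = ["position", "normal", "uv0", "uv1"]  # written by first engine
in_attrs = ["position", "uv0"]                    # read by second engine
rules = {"always_keep": ["position"]}
buffer_map = build_buffer_map(out_attrs, in_attrs, rules)
# Attributes only one engine touches ("normal", "uv1") are excluded,
# which is where the memory savings come from.
```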

    Inter-shader attribute buffer optimization
    6.
    Granted patent
    Inter-shader attribute buffer optimization (In force)

    Publication number: US08619087B2

    Publication date: 2013-12-31

    Application number: US12895579

    Filing date: 2010-09-30

    IPC classification: G06T1/20

    Abstract: One embodiment of the present invention sets forth a technique for reducing the amount of memory required to store vertex data processed within a processing pipeline that includes a plurality of shading engines. The method includes determining a first active shading engine and a second active shading engine included within the processing pipeline, wherein the second active shading engine receives vertex data output by the first active shading engine. An output map is received and indicates one or more attributes that are included in the vertex data and output by the first active shading engine. An input map is received and indicates one or more attributes that are included in the vertex data and received by the second active shading engine from the first active shading engine. Then, a buffer map is generated based on the input map, the output map, and a pre-defined set of rules that includes rule data associated with both the first shading engine and the second shading engine, wherein the buffer map indicates one or more attributes that are included in the vertex data and stored in a memory that is accessible by both the first active shading engine and the second active shading engine.


    COMPUTE THREAD ARRAY GRANULARITY EXECUTION PREEMPTION
    7.
    Patent application
    COMPUTE THREAD ARRAY GRANULARITY EXECUTION PREEMPTION (Pending, published)

    Publication number: US20130132711A1

    Publication date: 2013-05-23

    Application number: US13302962

    Filing date: 2011-11-22

    IPC classification: G06F9/38

    CPC classification: G06F9/461

    Abstract: One embodiment of the present invention sets forth a technique for instruction-level and compute thread array granularity execution preemption. Preempting at the instruction level does not require any draining of the processing pipeline. No new instructions are issued and the context state is unloaded from the processing pipeline. When preemption is performed at a compute thread array boundary, the amount of context state to be stored is reduced because execution units within the processing pipeline complete execution of in-flight instructions and become idle. If the amount of time needed to complete execution of the in-flight instructions exceeds a threshold, then the preemption may dynamically change to be performed at the instruction level instead of at compute thread array granularity.

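    The dynamic fallback the abstract describes can be sketched as a simple policy: prefer preempting at a compute thread array (CTA) boundary, but switch to instruction-level preemption when draining in-flight work would exceed a time budget. The threshold comparison and the cycle-count timing model are illustrative assumptions.

```python
def choose_preemption_level(estimated_drain_cycles, threshold_cycles):
    """Pick the preemption granularity for a pending context switch."""
    if estimated_drain_cycles <= threshold_cycles:
        # Execution units finish in-flight instructions and go idle at the
        # CTA boundary, so far less context state has to be saved.
        return "cta_boundary"
    # Draining would take too long: stop issuing new instructions and
    # unload the full context state from the pipeline instead.
    return "instruction_level"
```

    The trade-off is latency versus state size: CTA-boundary preemption saves little state but may wait on long-running work, while instruction-level preemption responds immediately at the cost of unloading the full pipeline context.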