Patent search ap:"Jerome F. Duluk" Page 1

1.

Granted patent
Execution state analysis for assigning tasks to streaming multiprocessors [In force]

Publication number: US09715413B2

Publication date: 2017-07-25

Application number: US13353155

Filing date: 2012-01-18

Applicants: Karim M. Abdalla; Lacky V. Shah; Jerome F. Duluk, Jr.; Timothy John Purcell; Tanmoy Mandal; Gentaro Hirota

Inventors: Karim M. Abdalla; Lacky V. Shah; Jerome F. Duluk, Jr.; Timothy John Purcell; Tanmoy Mandal; Gentaro Hirota

IPC classification: G06F9/46, G06F13/00, G06F9/50

CPC classification: G06F9/505, G06F2209/503

Abstract: One embodiment of the present invention sets forth a technique for selecting a first processor included in a plurality of processors to receive work related to a compute task. The technique involves analyzing state data of each processor in the plurality of processors to identify one or more processors that have already been assigned one compute task and are eligible to receive work related to the one compute task, receiving, from each of the one or more processors identified as eligible, an availability value that indicates the capacity of the processor to receive new work, selecting a first processor to receive work related to the one compute task based on the availability values received from the one or more processors, and issuing, to the first processor via a cooperative thread array (CTA), the work related to the one compute task.
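
Illustrative sketch (not taken from the patent): the selection step in the abstract above can be pictured as filtering processors by eligibility and then choosing by availability value. The `Processor` class, its field names, and the "highest availability wins" policy are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Processor:
    # Hypothetical per-processor state; field names are not from the patent.
    pid: int
    assigned_task: Optional[str]  # compute task already assigned, if any
    eligible: bool                # may accept more work for that task
    availability: int             # capacity for new work (higher = more room)


def select_processor(processors: List[Processor], task: str) -> Optional[Processor]:
    """Return the eligible processor with the highest availability, or None."""
    candidates = [p for p in processors if p.assigned_task == task and p.eligible]
    if not candidates:
        return None
    # Assumed policy: pick the candidate reporting the most spare capacity.
    return max(candidates, key=lambda p: p.availability)


processors = [
    Processor(0, "taskA", True, 3),
    Processor(1, "taskA", True, 7),
    Processor(2, "taskB", True, 9),   # assigned a different task
    Processor(3, "taskA", False, 8),  # assigned the task but not eligible
]
chosen = select_processor(processors, "taskA")
print(chosen.pid)
```

In this sketch, processor 2 is skipped because it holds a different task and processor 3 because it is marked ineligible, so the choice is made only among processors 0 and 1.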

2.

Granted patent
Inter-shader attribute buffer optimization [In force]

Publication number: US08619087B2

Publication date: 2013-12-31

Application number: US12895579

Filing date: 2010-09-30

Applicants: Jerome F. Duluk, Jr.; Gernot Schaufler

Inventors: Jerome F. Duluk, Jr.; Gernot Schaufler

IPC classification: G06T1/20

CPC classification: G06T1/20, G06F9/3851, G06F9/3887, G06T15/005, G06T2210/04, G06T2210/52

Abstract: One embodiment of the present invention sets forth a technique for reducing the amount of memory required to store vertex data processed within a processing pipeline that includes a plurality of shading engines. The method includes determining a first active shading engine and a second active shading engine included within the processing pipeline, wherein the second active shading engine receives vertex data output by the first active shading engine. An output map is received and indicates one or more attributes that are included in the vertex data and output by the first active shading engine. An input map is received and indicates one or more attributes that are included in the vertex data and received by the second active shading engine from the first active shading engine. Then, a buffer map is generated based on the input map, the output map, and a pre-defined set of rules that includes rule data associated with both the first shading engine and the second shading engine, wherein the buffer map indicates one or more attributes that are included in the vertex data and stored in a memory that is accessible by both the first active shading engine and the second active shading engine.
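
Illustrative sketch (not taken from the patent): one way to picture the buffer map is as a bitmask combining the output map and the input map. The attribute names, the bitmask encoding, and the single `always_keep` rule are assumptions for this example; the patent's actual rule set is not described in the abstract.

```python
# Hypothetical attribute slots; real hardware attribute layouts will differ.
ATTRS = ["position", "normal", "color", "texcoord0", "texcoord1"]


def to_mask(names):
    """Encode a list of attribute names as a bitmask."""
    mask = 0
    for name in names:
        mask |= 1 << ATTRS.index(name)
    return mask


def from_mask(mask):
    """Decode a bitmask back into attribute names, in slot order."""
    return [name for i, name in enumerate(ATTRS) if mask & (1 << i)]


def build_buffer_map(output_map, input_map, always_keep=0):
    """Keep attributes that stage 1 writes and stage 2 reads.

    `always_keep` models one assumed rule: attributes that must be stored
    whenever the first stage produces them, even if the second stage does
    not read them.
    """
    return (output_map & input_map) | (always_keep & output_map)


output_map = to_mask(["position", "normal", "color", "texcoord0"])
input_map = to_mask(["position", "color", "texcoord1"])
buffer_map = build_buffer_map(output_map, input_map, to_mask(["position"]))
print(from_mask(buffer_map))
```

Here `normal` and `texcoord0` are written but never read, and `texcoord1` is read but never written, so none of the three needs a slot in the shared buffer.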

3.

Patent application
TECHNIQUE FOR COMPUTATIONAL NESTED PARALLELISM [In force]

Publication number: US20130298133A1

Publication date: 2013-11-07

Application number: US13462649

Filing date: 2012-05-02

Applicants: Stephen JONES; Philip Alexander Cuadra; Daniel Elliot Wexler; Ignacio Llamas; Lacky V. Shah; Jerome F. Duluk, JR.; Christopher Lamb

Inventors: Stephen JONES; Philip Alexander Cuadra; Daniel Elliot Wexler; Ignacio Llamas; Lacky V. Shah; Jerome F. Duluk, JR.; Christopher Lamb

IPC classification: G06F9/50

CPC classification: G06F9/5027, G06F9/522, G06F2209/483, G06T1/20

Abstract: One embodiment of the present invention sets forth a technique for performing nested kernel execution within a parallel processing subsystem. The technique involves enabling a parent thread to launch a nested child grid on the parallel processing subsystem, and enabling the parent thread to perform a thread synchronization barrier on the child grid for proper execution semantics between the parent thread and the child grid. This technique advantageously enables the parallel processing subsystem to perform a richer set of programming constructs, such as conditionally executed and nested operations and externally defined library functions without the additional complexity of CPU involvement.
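
Illustrative sketch (not taken from the patent): the launch-then-synchronize pattern in the abstract above can be imitated on a CPU with a thread pool. This is only an analogy for the control flow; the function names, the four-block "grid", and the use of `concurrent.futures` are assumptions and do not model GPU hardware.

```python
from concurrent.futures import ThreadPoolExecutor, wait

NUM_BLOCKS = 4  # assumed size of the child "grid"


def child_kernel(block_id, data):
    """Each child block sums a strided slice of the input."""
    return sum(data[block_id::NUM_BLOCKS])


def parent_thread(data, pool):
    """Conditionally launch a child grid, then wait on it before continuing."""
    if not data:
        return 0  # conditional execution: no child grid is launched
    futures = [pool.submit(child_kernel, b, data) for b in range(NUM_BLOCKS)]
    wait(futures)  # stands in for the synchronization barrier on the child grid
    return sum(f.result() for f in futures)


with ThreadPoolExecutor(max_workers=NUM_BLOCKS) as pool:
    total = parent_thread(list(range(10)), pool)
print(total)  # 45
```

The parent only reads child results after the wait completes, which is the ordering guarantee the barrier is meant to provide.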

4.

Granted patent
System and method for utilizing semaphores in a graphics pipeline [In force]

Publication number: US08525842B1

Publication date: 2013-09-03

Application number: US11454389

Filing date: 2006-06-16

Applicants: Jerome F. Duluk, Jr.; Richard A. Silkebakken

Inventors: Jerome F. Duluk, Jr.; Richard A. Silkebakken

IPC classification: G06T1/20

CPC classification: G06T1/20, G06F9/52, G06F9/526, G06T15/005

Abstract: A semaphore system, method, and computer program product are provided for use in a graphics environment. In operation, a semaphore is operated upon utilizing a plurality of graphics processing modules for a variety of graphics processing-related purposes (e.g. for example, controlling access to graphics data by the graphics processing modules, etc.).

5.

Patent application
AUTOMATIC DEPENDENT TASK LAUNCH [Under examination, published]

Publication number: US20130198760A1

Publication date: 2013-08-01

Application number: US13360581

Filing date: 2012-01-27

Applicants: Philip Alexander CUADRA; Lacky V. Shah; Timothy John Purcell; Gerald F. Luiz; Jerome F. Duluk, JR.

Inventors: Philip Alexander CUADRA; Lacky V. Shah; Timothy John Purcell; Gerald F. Luiz; Jerome F. Duluk, JR.

IPC classification: G06F9/46

CPC classification: G06F9/4881, G06F9/445, G06F2209/484

Abstract: One embodiment of the present invention sets forth a technique for automatic launching of a dependent task when execution of a first task completes. Automatically launching the dependent task reduces the latency incurred during the transition from the first task to the dependent task. Information associated with the dependent task is encoded as part of the metadata for the first task. When execution of the first task completes a task scheduling unit is notified and the dependent task is launched without requiring any release or acquisition of a semaphore. The information associated with the dependent task includes an enable flag and a pointer to the dependent task. Once the dependent task is launched, the first task is marked as complete so that memory storing the metadata for the first task may be reused to store metadata for a new task.
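
Illustrative sketch (not taken from the patent): the abstract above describes task metadata that carries an enable flag and a pointer to a dependent task. A minimal software model follows; the `TaskMetadata` class, its field names, and the sequential `schedule` loop are assumptions for this example, and no semaphore is involved.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TaskMetadata:
    # Hypothetical metadata layout; field names are not from the patent.
    name: str
    run: Callable[[], None]
    dependent_enable: bool = False               # enable flag
    dependent: Optional["TaskMetadata"] = None   # pointer to the dependent task
    complete: bool = False


def schedule(first: TaskMetadata, log: List[str]) -> None:
    """Run a task, then automatically hand off to its enabled dependent."""
    task = first
    while task is not None:
        task.run()
        log.append(task.name)
        # Read the dependent-task information from the finished task's metadata.
        launched = task.dependent if task.dependent_enable else None
        # Only after the hand-off is decided is the finished task marked
        # complete, so its metadata storage could be reused.
        task.complete = True
        task = launched


task_c = TaskMetadata("C", lambda: None)
task_b = TaskMetadata("B", lambda: None, dependent_enable=False, dependent=task_c)
task_a = TaskMetadata("A", lambda: None, dependent_enable=True, dependent=task_b)

log: List[str] = []
schedule(task_a, log)
print(log)
```

Task C never runs in this sketch because B's enable flag is cleared, even though B holds a pointer to C.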

6.

Patent application
COMPUTE TASK STATE ENCAPSULATION [Under examination, published]

Publication number: US20130117751A1

Publication date: 2013-05-09

Application number: US13292951

Filing date: 2011-11-09

Applicants: Jerome F. DULUK, JR.; Lacky V. SHAH; Sean J. TREICHLER

Inventors: Jerome F. DULUK, JR.; Lacky V. SHAH; Sean J. TREICHLER

IPC classification: G06F9/46

Abstract: One embodiment of the present invention sets forth a technique for encapsulating compute task state that enables out-of-order scheduling and execution of the compute tasks. The scheduling circuitry organizes the compute tasks into groups based on priority levels. The compute tasks may then be selected for execution using different scheduling schemes. Each group is maintained as a linked list of pointers to compute tasks that are encoded as task metadata (TMD) stored in memory. A TMD encapsulates the state and parameters needed to initialize, schedule, and execute a compute task.
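
Illustrative sketch (not taken from the patent): the per-priority groups of TMD pointers described above can be modeled with one queue per priority level. The `TMD` fields, the use of `collections.deque` in place of a hardware linked list, and the strict "highest priority first" scheme are assumptions for this example; the abstract notes that other scheduling schemes are possible.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TMD:
    # Hypothetical task metadata; real TMDs hold far more state.
    name: str
    priority: int                      # 0 is treated as the highest priority
    params: dict = field(default_factory=dict)


class Scheduler:
    def __init__(self, levels: int) -> None:
        # One list of TMD references per priority level.
        self.groups = [deque() for _ in range(levels)]

    def add(self, tmd: TMD) -> None:
        self.groups[tmd.priority].append(tmd)

    def next_task(self) -> Optional[TMD]:
        """Assumed scheme: take the oldest task from the highest non-empty level."""
        for group in self.groups:
            if group:
                return group.popleft()
        return None


sched = Scheduler(levels=3)
for tmd in [TMD("low", 2), TMD("high", 0), TMD("mid", 1), TMD("high2", 0)]:
    sched.add(tmd)

order = []
while (task := sched.next_task()) is not None:
    order.append(task.name)
print(order)
```

Tasks leave the scheduler in a different order than they arrived, which is the out-of-order selection that self-contained task state makes possible.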

7.

Granted patent
Method and system for connecting multiple shaders [In force]

Publication number: US08223158B1

Publication date: 2012-07-17

Application number: US11613018

Filing date: 2006-12-19

Applicants: John Erik Lindholm; Michael C. Shebanow; Jerome F. Duluk, Jr.

Inventors: John Erik Lindholm; Michael C. Shebanow; Jerome F. Duluk, Jr.

IPC classification: G06T1/20

CPC classification: G06T1/20

Abstract: A method and system for connecting multiple shaders are disclosed. Specifically, one embodiment of the present invention sets forth a method, which includes the steps of configuring a set of shaders in a user-defined sequence within a modular pipeline (MPipe), allocating resources to execute the programming instructions of each of the set of shaders in the user-defined sequence to operate on the data unit, and directing the output of the MPipe to an external sink.
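
Illustrative sketch (not taken from the patent): chaining shaders in a user-defined sequence and sending the result to a sink can be modeled as function composition. The `run_mpipe` function and the two toy shaders are assumptions for this example, and resource allocation is not modeled.

```python
def run_mpipe(shaders, data_unit, sink):
    """Apply each shader in the user-defined order, then emit to the sink."""
    for shader in shaders:
        data_unit = shader(data_unit)
    sink(data_unit)


def scale(values):
    """Toy shader: double every component."""
    return [x * 2 for x in values]


def offset(values):
    """Toy shader: add one to every component."""
    return [x + 1 for x in values]


received = []
run_mpipe([scale, offset], [1, 2, 3], received.append)
print(received)  # [[3, 5, 7]]
```

Reordering the list to `[offset, scale]` changes the result, which is why the sequence is part of the pipeline configuration.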

8.

Patent application
SPARSE TEXTURE SYSTEMS AND METHODS [In force]

Publication number: US20110157206A1

Publication date: 2011-06-30

Application number: US12651192

Filing date: 2009-12-31

Applicants: Jerome F. Duluk, JR.; Andrew Tao; Bryon Nordquist; Henry Moreton

Inventors: Jerome F. Duluk, JR.; Andrew Tao; Bryon Nordquist; Henry Moreton

IPC classification: G09G5/00

CPC classification: G06T5/001, G06T3/4007, G06T15/04

Abstract: Systems and methods for texture processing are presented. In one embodiment a texture method includes creating a sparse texture residency translation map; performing a probe process utilizing the sparse texture residency translation map information to return a finest LOD that contains the texels for a texture lookup operation; and performing the texture lookup operation utilizing the finest LOD. In one exemplary implementation, the finest LOD is utilized as a minimum LOD clamp during the texture lookup operation. A finest LOD number indicates a minimum resident LOD and a sparse texture residency translation map includes one finest LOD number per tile of a sparse texture. The sparse texture residency translation can indicate a minimum resident LOD.
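
Illustrative sketch (not taken from the patent): the probe-then-clamp flow in the abstract above can be modeled with a small 2D table holding one finest resident LOD number per tile. The function names, the 64-texel tile size, and the convention that LOD 0 is the most detailed level (so a larger number is coarser) are assumptions for this example.

```python
TILE_SIZE = 64  # assumed tile edge length, in LOD-0 texels


def probe_finest_lod(residency_map, u, v):
    """Return the finest resident LOD number for the tile containing (u, v)."""
    tile_x, tile_y = u // TILE_SIZE, v // TILE_SIZE
    return residency_map[tile_y][tile_x]


def clamped_lod(requested_lod, residency_map, u, v):
    """Use the probed value as a minimum LOD clamp for the lookup."""
    return max(requested_lod, probe_finest_lod(residency_map, u, v))


# One finest-LOD number per tile of a 2x2-tile sparse texture.
residency_map = [
    [0, 2],
    [1, 3],
]

print(clamped_lod(0, residency_map, 100, 10))  # 2
print(clamped_lod(3, residency_map, 10, 10))   # 3
```

In the first call the requested level 0 is not resident for that tile, so the lookup falls back to level 2; in the second call the requested level is already coarser than the resident minimum and is left unchanged.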

9.

Patent application
SPARSE TEXTURE SYSTEMS AND METHODS [In force]

Publication number: US20110157205A1

Publication date: 2011-06-30

Application number: US12651141

Filing date: 2009-12-31

Applicants: Andrew Tao; Jerome F. Duluk, JR.; Jesse D. Hall; Henry Moreton

Inventors: Andrew Tao; Jerome F. Duluk, JR.; Jesse D. Hall; Henry Moreton

IPC classification: G09G5/00

CPC classification: G06T15/04

Abstract: Systems and methods for texture processing are presented. In one embodiment a texture method includes creating a sparse texture residency translation map; performing a probe process utilizing the sparse texture residency translation map information to return a finest LOD that contains the texels for a texture lookup operation; and performing the texture lookup operation utilizing the finest LOD. In one exemplary implementation, the finest LOD is utilized as a minimum LOD clamp during the texture lookup operation. A finest LOD number indicates a minimum resident LOD and a sparse texture residency translation map includes one finest LOD number per tile of a sparse texture. The sparse texture residency translation can indicate a minimum resident LOD.

10.

Patent application
Methods to Facilitate Primitive Batching [In force]

Publication number: US20110080416A1

Publication date: 2011-04-07

Application number: US12898624

Filing date: 2010-10-05

Applicants: Jerome F. Duluk, JR.; Thomas Roell; Patrick R. Brown

Inventors: Jerome F. Duluk, JR.; Thomas Roell; Patrick R. Brown

IPC classification: G06T1/20

CPC classification: G06T1/20, G06F9/3851, G06F9/3887, G06T15/005, G06T2210/04, G06T2210/52

Abstract: One embodiment of the present invention sets forth a technique for splitting a set of vertices into a plurality of batches for processing. The method includes receiving one or more primitives each containing an associated set of vertices. For each of the one or more primitives, one or more vertices are gathered from the set of vertices, the vertices are arranged into one or more batches, the batch is routed to a processing pipeline line to process each batch as a separate primitive, and the one or more batches are processed to produce results identical to those of processing the entire primitive as a single entity.
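
Illustrative sketch (not taken from the patent): for a line strip, splitting into batches gives identical results only if each batch repeats the last vertex of the previous one. The choice of a line strip, the function names, and the one-vertex overlap are assumptions for this example; the abstract does not say which primitive types or overlap rules the patent uses.

```python
def split_line_strip(vertices, max_batch):
    """Split a line strip into batches that share one boundary vertex."""
    if max_batch < 2:
        raise ValueError("a line-strip batch needs at least two vertices")
    batches = []
    start = 0
    while start < len(vertices) - 1:
        batches.append(vertices[start:start + max_batch])
        start += max_batch - 1  # step back one vertex so segments connect
    return batches


def segments(strip):
    """Expand a line strip into its individual line segments."""
    return list(zip(strip, strip[1:]))


strip = list(range(7))
batches = split_line_strip(strip, max_batch=3)
print(batches)  # [[0, 1, 2], [2, 3, 4], [4, 5, 6]]

# Processing each batch as its own strip yields the same segments
# as processing the whole strip at once.
batched_segments = [seg for batch in batches for seg in segments(batch)]
print(batched_segments == segments(strip))  # True
```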
