Patent search ap:("Intel Corporation") AND inv:"Niraj Gupta" Page 1

1.

Patent application
MULTI-TENANCY PROTECTION FOR ACCELERATORS (Active)

Publication number: US20220311594A1

Publication date: 2022-09-29

Application number: US17569488

Filing date: 2022-01-05

Applicant: Intel Corporation

Inventor: Akshay Kadam , Sivakumar B , Lawrence Booth, JR. , Niraj Gupta , Steven Tu , Ricardo Becker , Subba Mungara , Tuyet-Trang Piel , Mitul Shah , Raynald Lim , Mihai Bogdan Bucsa , Cliodhna Ni Scanaill , Roman Zubarev , Dmitry Budnikov , Lingyun Zhu , Yi Qian , Stewart Taylor

IPC: H04L9/06 , H04L9/08 , G06F9/455

Abstract: An accelerator includes a memory, a compute zone to receive an encrypted workload downloaded from a tenant application running in a virtual machine on a host computing system attached to the accelerator, and a processor subsystem to execute a cryptographic key exchange protocol with the tenant application to derive a session key for the compute zone and to program the session key into the compute zone. The compute zone is to decrypt the encrypted workload using the session key, receive an encrypted data stream from the tenant application, decrypt the encrypted data stream using the session key, and process the decrypted data stream by executing the workload to produce metadata.
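
The session-key flow described in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the abstract does not name a key exchange protocol or cipher, so the sketch uses a small Diffie-Hellman group and a SHA-256 counter keystream, neither of which is suitable for real use. The names `keypair`, `session_key`, and `xor_stream` are hypothetical.

```python
import hashlib
import secrets

# Toy Diffie-Hellman parameters (assumption; far too small for real use).
P = 2**127 - 1  # a Mersenne prime
G = 3


def keypair():
    """Return (private, public) for the toy group."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    return private, pow(G, private, P)


def session_key(private, peer_public):
    """Hash the shared secret down to a 32-byte session key."""
    shared = pow(peer_public, private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


def xor_stream(key, data):
    """Toy cipher: XOR with a SHA-256 counter keystream (same call encrypts and decrypts)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# Tenant application and accelerator each derive the same session key.
tenant_priv, tenant_pub = keypair()
accel_priv, accel_pub = keypair()
tenant_key = session_key(tenant_priv, accel_pub)
accel_key = session_key(accel_priv, tenant_pub)

# The tenant encrypts a workload; the compute zone decrypts it with its key.
workload = b"hypothetical workload image for the compute zone"
encrypted = xor_stream(tenant_key, workload)
decrypted = xor_stream(accel_key, encrypted)
```

The sketch encrypts a single message only. Reusing the same keystream for a second message (for example, the data stream) would be insecure without a per-message nonce, which is omitted here for brevity.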

2.

Granted patent
Using a global barrier to synchronize across local thread groups in general purpose programming on GPU (Active)

Publication number: US09916162B2

Publication date: 2018-03-13

Application number: US14563601

Filing date: 2014-12-08

Applicant: Intel Corporation

Inventor: Niraj Gupta

IPC: G06F15/80 , G06F9/38 , G06F9/48

CPC classification number: G06F9/3851 , G06F9/48

Abstract: Methods and systems may synchronize workloads across local thread groups. The methods and systems may provide for receiving, at a graphics processor, a workload from a host processor and receiving, at a plurality of processing elements, a plurality of threads that form one or more local thread groups. Additionally, the processing of the workload may be synchronized across the one or more thread groups. In one example, the global barrier determines that all threads across the thread groups have been completed without polling.
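
As a rough CPU-side analogy for the idea of a barrier that spans several thread groups, the sketch below uses Python's `threading.Barrier`. It is not the GPU mechanism claimed in the patent; the group sizes and the names `worker`, `partial`, and `totals` are hypothetical. Each thread blocks in `wait()` instead of repeatedly checking a completion flag.

```python
import threading

GROUPS = 3
THREADS_PER_GROUP = 4

# One barrier shared by every thread in every group.
global_barrier = threading.Barrier(GROUPS * THREADS_PER_GROUP)
partial = [[0] * THREADS_PER_GROUP for _ in range(GROUPS)]
totals = [[None] * THREADS_PER_GROUP for _ in range(GROUPS)]


def worker(group, lane):
    # Phase 1: each thread writes only its own slot.
    partial[group][lane] = group * THREADS_PER_GROUP + lane
    # Block until all threads in all groups have finished phase 1.
    global_barrier.wait()
    # Phase 2: results from the other groups are now safe to read.
    totals[group][lane] = sum(sum(row) for row in partial)


threads = [
    threading.Thread(target=worker, args=(g, lane))
    for g in range(GROUPS)
    for lane in range(THREADS_PER_GROUP)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the barrier, every thread computes the same total because all phase-1 writes are complete before any phase-2 read begins.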

3.

Granted patent
Systems, methods, and computer program products for performing mathematical operations (Active)

Publication number: US09489342B2

Publication date: 2016-11-08

Application number: US14127178

Filing date: 2013-06-21

Applicant: Intel Corporation

Inventor: Niraj Gupta , Karthik N

IPC: G06F17/10 , G06F17/15 , G06F17/16

CPC classification number: G06F17/10 , G06F17/15 , G06F17/16

Abstract: The system has first, second, third, and fourth subsystems. Each subsystem has first and second multipliers coupled, respectively, to first and second adders. Each multiplier has two inputs. The first adder is coupled to a first output, a first accumulator, and a bit shifter. The bit shifter is coupled to a third adder. The third adder is coupled to a multiplexer. The multiplexer is coupled to a second output and a second accumulator. The second adder is coupled to the third adder and the multiplexer. The first outputs of the first and second subsystems are coupled directly to a fourth adder, the second outputs of the first and second subsystems are coupled directly to a fifth adder, the first outputs of the third and fourth subsystems are coupled directly to a sixth adder, and the second outputs of the third and fourth subsystems are coupled directly to a seventh adder.

4.

Granted patent
Optimizing fixed point divide (Active)

Publication number: US09158498B2

Publication date: 2015-10-13

Application number: US13759274

Filing date: 2013-02-05

Applicant: INTEL CORPORATION

Inventor: Niraj Gupta

IPC: G06F7/52 , G06F7/535

CPC classification number: G06F7/535 , G06F2207/5354 , G06F2207/5356

Abstract: Systems, apparatus and methods are described related to optimizing fixed point divide.

5.

Patent application
USING A GLOBAL BARRIER TO SYNCHRONIZE ACROSS LOCAL THREAD GROUPS IN GENERAL PURPOSE PROGRAMMING ON GPU (Active)

Publication number: US20150187042A1

Publication date: 2015-07-02

Application number: US14563601

Filing date: 2014-12-08

Applicant: Intel Corporation

Inventor: Niraj Gupta

IPC: G06T1/20 , G06F9/38

CPC classification number: G06F9/3851 , G06F9/48

Abstract: Methods and systems may synchronize workloads across local thread groups. The methods and systems may provide for receiving, at a graphics processor, a workload from a host processor and receiving, at a plurality of processing elements, a plurality of threads that form one or more local thread groups. Additionally, the processing of the workload may be synchronized across the one or more thread groups. In one example, the global barrier determines that all threads across the thread groups have been completed without polling.

6.

Granted patent
Techniques for connected component labeling (Active)

Publication number: US09042652B2

Publication date: 2015-05-26

Application number: US13666913

Filing date: 2012-11-01

Applicant: Intel Corporation

Inventor: Niraj Gupta , Oren Agam , Benny Eitan , Mostafa Hagog

IPC: G06K9/34 , G06K9/46 , G06T7/00

CPC classification number: G06K9/4638 , G06T7/11 , G06T7/187 , G06T2200/28

Abstract: An apparatus may include a memory, a processor circuit, and a connected component labeling module. The connected component labeling module may be operative on the processor circuit to determine one or more connected components during reading of an image comprising a multiplicity of pixels from the memory, assign a label to a plurality of pixels of the multiplicity of pixels, generate one or more label connections for a respective one or more labels, each label connection linking a higher label to a lowest label for the same connected component, and write to the memory for each label of the one or more labels a lowest label as defined by the label connection for that label after a label is assigned to each pixel.
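
The notion of a label connection that links a higher label to the lowest label of the same component can be sketched with a conventional two-pass labeling routine. This is a generic textbook-style illustration, not the patented module; it assumes 4-connectivity and a binary image given as a list of lists, and the names `label_components`, `lowest`, and `find` are hypothetical.

```python
def label_components(image):
    """Label 4-connected foreground pixels (truthy values) of a 2D list."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    lowest = {}  # label connection: label -> a lower label of the same component

    def find(label):
        # Follow connections down to the lowest label of the component.
        while lowest[label] != label:
            label = lowest[label]
        return label

    next_label = 1
    # First pass: assign provisional labels and record label connections.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            neighbours = [find(n) for n in (up, left) if n]
            if not neighbours:
                lowest[next_label] = next_label
                labels[r][c] = next_label
                next_label += 1
            else:
                low = min(neighbours)
                labels[r][c] = low
                for n in neighbours:
                    lowest[n] = low  # link the higher label to the lowest one
    # Second pass: rewrite every pixel with the lowest label of its component.
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                labels[r][c] = find(labels[r][c])
    return labels


u_shape = label_components([[1, 0, 1],
                            [1, 0, 1],
                            [1, 1, 1]])
diagonal = label_components([[1, 0],
                             [0, 1]])
```

In the U-shaped image the two arms first receive labels 1 and 2, which are connected when the scan reaches the bottom-right pixel, so the second pass writes label 1 everywhere. The diagonal image keeps two separate labels because diagonal neighbours are not 4-connected.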

7.

Granted patent
Initiation of cache flushes and invalidations on graphics processors (Active)

Publication number: US09563561B2

Publication date: 2017-02-07

Application number: US13926328

Filing date: 2013-06-25

Applicant: Intel Corporation

Inventor: Niraj Gupta , Hong Jiang

IPC: G06F12/08

CPC classification number: G06F12/0837 , G06F12/0808 , G06F2212/302

Abstract: Methods and systems may provide for receiving, at a graphics processor, a workload from a host processor and using a kernel on the graphics processor to issue a thread group for execution of the workload on the graphics processor. Additionally, one or more coherency messages may be initiated, by the graphics processor, in response to a thread-related condition of one or more caches on the graphics processor. In one example, the thread-related condition is associated with the execution of the workload on the graphics processor and indicates that the one or more caches on the graphics processor are not coherent with a system memory associated with the host processor.

8.

Patent application
PARALLEL FLOOD-FILL TECHNIQUES AND ARCHITECTURE (Under examination, published)

Publication number: US20150077422A1

Publication date: 2015-03-19

Application number: US14550214

Filing date: 2014-11-21

Applicant: INTEL CORPORATION

Inventor: Alon Gluska , Niraj Gupta , Mostafa Hagog , Dror Reif

IPC: G06T1/20

CPC classification number: G06T1/20

Abstract: Flood-fill techniques and architecture are disclosed. In accordance with one embodiment, the architecture comprises a hardware primitive with a software interface which collectively allow for both data-based and task-based parallelism in executing a flood-fill process. The hardware primitive is defined to do the flood-fill function and is scalable and may be implemented with a bitwise definition that can be tuned to meet power/performance targets, in some embodiments. In executing a flood-fill operation, and in accordance with an example embodiment, the software interface produces parallel threads and issues them to processing elements, such that each of the threads can run independently until done. Each processing element in turn accesses a flood-fill hardware primitive, each of which is configured to flood a seed inside an N×M image block. In some cases, processing element commands to the flood-fill hardware primitive(s) can be queued and acted upon pursuant to an arbitration scheme.
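
A software-only sketch of the per-block idea follows: each task floods one seed inside its own N×M block, and the tasks run independently. The hardware primitive, the bitwise definition, and the arbitration scheme from the abstract are not modeled, and fills that would cross block boundaries are not handled. The names `flood_block`, `blocks`, and `seeds` are hypothetical, and 4-connectivity is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def flood_block(block, seed, fill):
    """Flood-fill one N x M block in place from seed=(row, col); return pixels filled."""
    rows, cols = len(block), len(block[0])
    target = block[seed[0]][seed[1]]
    if target == fill:
        return 0
    stack, filled = [seed], 0
    while stack:
        r, c = stack.pop()
        if 0 <= r < rows and 0 <= c < cols and block[r][c] == target:
            block[r][c] = fill
            filled += 1
            # Visit the four edge-adjacent neighbours.
            stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return filled


blocks = [
    [[0, 0, 1],
     [0, 1, 1],
     [0, 0, 0]],
    [[1, 1],
     [1, 0]],
]
seeds = [(0, 0), (1, 1)]

# One independent task per block, each running until its fill is done.
with ThreadPoolExecutor() as pool:
    counts = list(pool.map(lambda job: flood_block(job[0], job[1], 2),
                           zip(blocks, seeds)))
```

Because each task touches only its own block, no locking is needed between tasks in this sketch.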

9.

Granted patent
Parallel flood-fill techniques and architecture (Active)

Publication number: US08902238B2

Publication date: 2014-12-02

Application number: US13651854

Filing date: 2012-10-15

Applicant: Intel Corporation

Inventor: Alon Gluska , Niraj Gupta , Mostafa Hagog , Dror Reif

IPC: G06F15/80 , G09G5/02

CPC classification number: G06T1/20

Abstract: Flood-fill techniques and architecture are disclosed. In accordance with one embodiment, the architecture comprises a hardware primitive with a software interface which collectively allow for both data-based and task-based parallelism in executing a flood-fill process. The hardware primitive is defined to do the flood-fill function and is scalable and may be implemented with a bitwise definition that can be tuned to meet power/performance targets, in some embodiments. In executing a flood-fill operation, and in accordance with an example embodiment, the software interface produces parallel threads and issues them to processing elements, such that each of the threads can run independently until done. Each processing element in turn accesses a flood-fill hardware primitive, each of which is configured to flood a seed inside an N×M image block. In some cases, processing element commands to the flood-fill hardware primitive(s) can be queued and acted upon pursuant to an arbitration scheme.

10.

Patent application
MULTI-TENANCY PROTECTION FOR ACCELERATORS (Active)

Publication number: US20240396711A1

Publication date: 2024-11-28

Application number: US18785435

Filing date: 2024-07-26

Applicant: Intel Corporation

Inventor: Akshay Kadam , Sivakumar B , Lawrence Booth, JR. , Niraj Gupta , Steven Tu , Ricardo Becker , Subba Mungara , Tuyet-Trang Piel , Mitul Shah , Raynald Lim , Mihai Bogdan Bucsa , Cliodhna Ni Scanaill , Roman Zubarev , Dmitry Budnikov , Lingyun Zhu , Yi Qian , Stewart Taylor

IPC: H04L9/06 , G06F9/455 , H04L9/08

Abstract: An accelerator includes a memory, a compute zone to receive an encrypted workload downloaded from a tenant application running in a virtual machine on a host computing system attached to the accelerator, and a processor subsystem to execute a cryptographic key exchange protocol with the tenant application to derive a session key for the compute zone and to program the session key into the compute zone. The compute zone is to decrypt the encrypted workload using the session key, receive an encrypted data stream from the tenant application, decrypt the encrypted data stream using the session key, and process the decrypted data stream by executing the workload to produce metadata.
