Patent search ap:("Intel Corporation") AND inv:"Gautham Chinya" Page 2

11.

发明申请
AREA AND ENERGY EFFICIENT MULTI-PRECISION MULTIPLY-ACCUMULATE UNIT-BASED PROCESSOR 有权

公开(公告)号：US20210397414A1

公开(公告)日：2021-12-23

申请号：US17358868

申请日：2021-06-25

Applicant: Intel Corporation

Inventor： Arnab Raha , Mark A. Anders , Martin Power , Martin Langhammer , Himanshu Kaul , Debabrata Mohapatra , Gautham Chinya , Cormac Brick , Ram Krishnamurthy

IPC: G06F7/544 , G06F7/527 , G06F5/01

Abstract: Systems, apparatuses and methods may provide for multi-precision multiply-accumulate (MAC) technology that includes a plurality of arithmetic blocks, wherein the plurality of arithmetic blocks each contain multiple multipliers, and wherein the logic is to combine multipliers one or more of within each arithmetic block or across multiple arithmetic blocks. In one example, one or more intermediate multipliers are of a size that is less than precisions supported by arithmetic blocks containing the one or more intermediate multipliers.

12.

发明申请
METHODS AND APPARATUS TO LOAD DATA WITHIN A MACHINE LEARNING ACCELERATOR 有权

公开(公告)号：US20210326144A1

公开(公告)日：2021-10-21

申请号：US17359392

申请日：2021-06-25

Applicant: Intel Corporation

Inventor： Arnab Raha , Deepak Mathaikutty , Debabrata Mohapatra , Sang Kyun Kim , Gautham Chinya , Cormac Brick

IPC: G06F9/445 , G06F9/50 , G06F9/30 , G06N20/00 , H03K19/177 , H03K19/20

Abstract: Methods, apparatus, systems, and articles of manufacture to load data into an accelerator are disclosed. An example apparatus includes data provider circuitry to load a first section and an additional amount of compressed machine learning parameter data into a processor engine. Processor engine circuitry executes a machine learning operation using the first section of compressed machine learning parameter data. A compressed local data re-user circuitry determines if a second section is present in the additional amount of compressed machine learning parameter data. The processor engine circuitry executes a machine learning operation using the second section when the second section is present in the additional amount of compressed machine learning parameter data.

13.

发明申请
METHODS, SYSTEMS, ARTICLES OF MANUFACTURE, AND APPARATUS TO DECODE ZERO-VALUE-COMPRESSION DATA VECTORS 审中-公开

公开(公告)号：US20200228137A1

公开(公告)日：2020-07-16

申请号：US16832804

申请日：2020-03-27

Applicant: Intel Corporation

Inventor： Gautham Chinya , Debabrata Mohapatra , Arnab Raha , Huichu Liu , Cormac Brick

IPC: H03M7/30 , G06F16/22 , G06N3/063

Abstract: Methods, systems, articles of manufacture, and apparatus are disclosed to decode zero-value-compression data vectors. An example apparatus includes: a buffer monitor to monitor a buffer for a header including a value indicative of compressed data; a data controller to, when the buffer includes compressed data, determine a first value of a sparse select signal based on (1) a select signal and (2) a first position in a sparsity bitmap, the first value of the sparse select signal corresponding to a processing element that is to process a portion of the compressed data; and a write controller to, when the buffer includes compressed data, determine a second value of a write enable signal based on (1) the select signal and (2) a second position in the sparsity bitmap, the second value of the write enable signal corresponding to the processing element that is to process the portion of the compressed data.

14.

发明申请
Apparatus, System, And Method For Persistent User-Level Thread 审中-公开

公开(公告)号：US20170102944A1

公开(公告)日：2017-04-13

申请号：US15386615

申请日：2016-12-21

Applicant: Intel Corporation

Inventor： Gautham Chinya , Hong Wang , Prashant Sethi , Shivnandan Kaushik , Bryant Bigbee , John Shen , Richard Hankins , Xiang Zou , Baiju V. Patel , Jason W. Brandt , Anil Aggarwal , John L. Reid

IPC: G06F9/30 , G06F9/38 , G06F9/46

CPC classification number: G06F9/3005 , G06F9/3009 , G06F9/3851 , G06F9/3861 , G06F9/3877 , G06F9/3885 , G06F9/461

Abstract: Embodiments of the invention provide a method of creating, based on an operating-system-scheduled thread running on an operating-system-visible sequencer and using an instruction set extension, a persistent user-level thread to run on an operating-system-sequestered sequencer independently of context switch activities on the operating-system-scheduled thread. The operating-system-scheduled thread and the persistent user-level thread may share a common virtual address space. Embodiments of the invention may also provide a method of causing a service thread running on an additional operating-system-visible sequencer to provide operating system services to the persistent user-level thread. Embodiments of the invention may further provide apparatus, system, and machine-readable medium thereof.

15.

发明授权
Instruction set architecture-based inter-sequencer communications with a heterogeneous resource 有权
Title translation: 与异构资源的指令集基于架构的间隔器通信

公开(公告)号：US09459874B2

公开(公告)日：2016-10-04

申请号：US14541933

申请日：2014-11-14

Applicant: Intel Corporation

Inventor： Hong Wang , John Shen , Hong Jiang , Richard Hankins , Per Hammarlund , Dion Rodgers , Gautham Chinya , Baiju Patel , Shiv Kaushik , Bryant Bigbee , Gad Sheaffer , Yoav Talgam , Yuval Yosef , James P. Held

IPC: G06F9/38 , G06F9/30 , G06T1/20

CPC classification number: G06F9/30145 , G06F9/30 , G06F9/30181 , G06F9/3877 , G06F9/3879 , G06T1/20

Abstract: In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.

Abstract translation: 在一个实施例中，本发明包括一种用于在加速器和与其耦合的指令定序器之间直接通信的方法，其中加速器相对于指令定序器是异质资源。可以使用接口来提供这些资源之间的通信。通过这种通信机制，用户级应用可以直接与加速器进行通信，而无需操作系统支持。此外，指令定序器和加速器可以并行地执行操作。描述和要求保护其他实施例。

16.

发明授权
Methods and apparatus to load data within a machine learning accelerator 有权

公开(公告)号：US11922178B2

公开(公告)日：2024-03-05

申请号：US17359392

申请日：2021-06-25

Applicant: Intel Corporation

Inventor： Arnab Raha , Deepak Mathaikutty , Debabrata Mohapatra , Sang Kyun Kim , Gautham Chinya , Cormac Brick

IPC: G06F9/445 , G06F9/30 , G06F9/50 , G06N20/00 , H03K19/177 , H03K19/20

CPC classification number: G06F9/445 , G06F9/3001 , G06F9/5027 , G06N20/00 , H03K19/177 , H03K19/20

Abstract: Methods, apparatus, systems, and articles of manufacture to load data into an accelerator are disclosed. An example apparatus includes data provider circuitry to load a first section and an additional amount of compressed machine learning parameter data into a processor engine. Processor engine circuitry executes a machine learning operation using the first section of compressed machine learning parameter data. A compressed local data re-user circuitry determines if a second section is present in the additional amount of compressed machine learning parameter data. The processor engine circuitry executes a machine learning operation using the second section when the second section is present in the additional amount of compressed machine learning parameter data.

17.

发明公开
METHODS, SYSTEMS, ARTICLES OF MANUFACTURE, AND APPARATUS TO DECODE ZERO-VALUE-COMPRESSION DATA VECTORS 审中-公开

公开(公告)号：US20240022259A1

公开(公告)日：2024-01-18

申请号：US18465495

申请日：2023-09-12

Applicant: Intel Corporation

Inventor： Gautham Chinya , Debabrata Mohapatra , Arnab Raha , Huichu Liu , Cormac Brick

IPC: H03M7/30 , G06F16/22 , G06N3/063

CPC classification number: H03M7/3082 , G06F16/2237 , G06N3/063 , G06N3/08

Abstract: Methods, systems, articles of manufacture, and apparatus are disclosed to decode zero-value-compression data vectors. An example apparatus includes: a buffer monitor to monitor a buffer for a header including a value indicative of compressed data; a data controller to, when the buffer includes compressed data, determine a first value of a sparse select signal based on (1) a select signal and (2) a first position in a sparsity bitmap, the first value of the sparse select signal corresponding to a processing element that is to process a portion of the compressed data; and a write controller to, when the buffer includes compressed data, determine a second value of a write enable signal based on (1) the select signal and (2) a second position in the sparsity bitmap, the second value of the write enable signal corresponding to the processing element that is to process the portion of the compressed data.

18.

发明授权
Multiplication-free approximation for neural networks and sparse coding 有权

公开(公告)号：US11232273B2

公开(公告)日：2022-01-25

申请号：US17067979

申请日：2020-10-12

Applicant: Intel Corporation

Inventor： Gautham Chinya , Shihao Ji , Arnab Paul

IPC: G06K7/10 , G06N20/10 , G06F7/487 , G06F7/483 , G06N3/04 , G06F17/16 , G06K7/14

Abstract: Systems, apparatuses and methods may provide for replacing floating point matrix multiplication operations with an approximation algorithm or computation in applications that involve sparse codes and neural networks. The system may replace floating point matrix multiplication operations in sparse code applications and neural network applications with an approximation computation that applies an equivalent number of addition and/or subtraction operations.

19.

发明授权
Instruction set architecture-based inter-sequencer communications with a heterogeneous resource 有权
Title translation: 与异构资源的指令集基于架构的间隔器通信

公开(公告)号：US09588771B2

公开(公告)日：2017-03-07

申请号：US13791298

申请日：2013-03-08

Applicant: Intel Corporation

Inventor： Hong Wang , John Shen , Hong Jiang , Richard Hankins , Per Hammarlund , Dion Rodgers , Gautham Chinya , Baiju Patel , Shiv Kaushik , Bryant Bigbee , Gad Sheaffer , Yoav Talgam , Yuval Yosef , James P. Held

IPC: G06F9/38 , G06F9/30 , G06T1/20

CPC classification number: G06F9/30145 , G06F9/30 , G06F9/30181 , G06F9/3877 , G06F9/3879 , G06T1/20

Abstract: In one embodiment, the present invention includes a method for directly communicating between an accelerator and an instruction sequencer coupled thereto, where the accelerator is a heterogeneous resource with respect to the instruction sequencer. An interface may be used to provide the communication between these resources. Via such a communication mechanism a user-level application may directly communicate with the accelerator without operating system support. Further, the instruction sequencer and the accelerator may perform operations in parallel. Other embodiments are described and claimed.

Abstract translation: 在一个实施例中，本发明包括一种用于在加速器和与其耦合的指令定序器之间直接通信的方法，其中加速器相对于指令定序器是异质资源。可以使用接口来提供这些资源之间的通信。通过这种通信机制，用户级应用可以直接与加速器进行通信，而无需操作系统支持。此外，指令定序器和加速器可以并行地执行操作。描述和要求保护其他实施例。

20.

发明授权
Performance scaling for dataflow deep neural network hardware accelerators 有权

公开(公告)号：US12141683B2

公开(公告)日：2024-11-12

申请号：US17246341

申请日：2021-04-30

Applicant: Intel Corporation

Inventor： Arnab Raha , Debabrata Mohapatra , Gautham Chinya , Guruguhanathan Venkataramanan , Sang Kyun Kim , Deepak Mathaikutty , Raymond Sung , Cormac Brick

IPC: G06F17/10 , G06F9/30 , G06N3/04 , G06N3/063

Abstract: Embodiments of the present disclosure are directed toward techniques and configurations enhancing the performance of hardware (HW) accelerators. Disclosed embodiments include static MAC scaling arrangement, which includes architectures and techniques for scaling the performance per unit of power and performance per area of HW accelerators. Disclosed embodiments also include dynamic MAC scaling arrangement, which includes architectures and techniques for dynamically scaling the number of active multiply-and-accumulate (MAC) within an HW accelerator based on activation and weight sparsity. Other embodiments may be described and/or claimed.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification