Patent search ap:("Intel Corporation") AND inv:"Gautham Chinya" Page 1

1.

发明授权
Multiplication-free approximation for neural networks and sparse coding 有权

公开(公告)号：US11714977B2

公开(公告)日：2023-08-01

申请号：US17554255

申请日：2021-12-17

Applicant: Intel Corporation

Inventor： Gautham Chinya , Shihao Ji , Arnab Paul

IPC: G06K7/10 , G06N20/10 , G06F7/487 , G06F7/483 , G06N3/045 , G06F17/16 , G06K7/14 , G06N3/04

CPC classification number: G06K7/10722 , G06F7/483 , G06F7/487 , G06F17/16 , G06K7/1413 , G06N3/04 , G06N3/045 , G06N20/10 , G06F2207/4824

Abstract: Systems, apparatuses and methods may provide for replacing floating point matrix multiplication operations with an approximation algorithm or computation in applications that involve sparse codes and neural networks. The system may replace floating point matrix multiplication operations in sparse code applications and neural network applications with an approximation computation that applies an equivalent number of addition and/or subtraction operations.

2.

发明申请
ACCELERATED LOADING OF UNSTRUCTURED SPARSE DATA IN MACHINE LEARNING ARCHITECTURES 有权

公开(公告)号：US20210042617A1

公开(公告)日：2021-02-11

申请号：US17081509

申请日：2020-10-27

Applicant: Intel Corporation

Inventor： Gautham Chinya , Deepak Mathaikutty , Guruguhanathan Venkataramanan , Debabrata Mohapatra , Moongon Jung , Sang Kyun Kim , Arnab Raha , Cormac Brick

IPC: G06N3/063 , G06N3/04

Abstract: Systems, apparatuses and methods may provide for technology that identify an assignment of weights of a workload to a plurality of processing elements, where the workload is to be associated with a neural network. The technology generates a representation that is to represent whether each of the weights is a zero value or a non-zero value. The technology further stores the representation into partitions of a storage structure based on the assignment of the weights, where the partitions are each to be dedicated to a different one of the processing elements.

3.

发明申请
Apparatus, System, And Method For Persistent User-Level Thread 有权

公开(公告)号：US20160274910A1

公开(公告)日：2016-09-22

申请号：US15166469

申请日：2016-05-27

Applicant: Intel Corporation

Inventor： Gautham Chinya , Hong Wang , Prashant Sethi , Shivnandan Kaushik , Bryant Bigbee , John Shen , Richard Hankins , Xiang Zou , Baiju V. Patel , Jason W. Brandt , Anil Aggarwal , John L. Reid

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/3005 , G06F9/3009 , G06F9/3851 , G06F9/3861 , G06F9/3877 , G06F9/3885 , G06F9/461

Abstract: Embodiments of the invention provide a method of creating, based on an operating-system-scheduled thread running on an operating-system-visible sequencer and using an instruction set extension, a persistent user-level thread to run on an operating-system-sequestered sequencer independently of context switch activities on the operating-system-scheduled thread. The operating-system-scheduled thread and the persistent user-level thread may share a common virtual address space. Embodiments of the invention may also provide a method of causing a service thread running on an additional operating-system-visible sequencer to provide operating system services to the persistent user-level thread. Embodiments of the invention may further provide apparatus, system, and machine-readable medium thereof.

4.

发明申请
PERFORMANCE SCALING FOR DATAFLOW DEEP NEURAL NETWORK HARDWARE ACCELERATORS 有权

公开(公告)号：US20250036928A1

公开(公告)日：2025-01-30

申请号：US18907748

申请日：2024-10-07

Applicant: Intel Corporation

Inventor： Arnab Raha , Debabrata Mohapatra , Gautham Chinya , Guruguhanathan Venkataramanan , Sang Kyun Kim , Deepak Mathaikutty , Raymond Sung , Cormac Brick

IPC: G06N3/063 , G06F9/30 , G06N3/04

Abstract: Embodiments of the present disclosure are directed toward techniques and configurations enhancing the performance of hardware (HW) accelerators. Disclosed embodiments include static MAC scaling arrangement, which includes architectures and techniques for scaling the performance per unit of power and performance per area of HW accelerators. Disclosed embodiments also include dynamic MAC scaling arrangement, which includes architectures and techniques for dynamically scaling the number of active multiply-and-accumulate (MAC) within an HW accelerator based on activation and weight sparsity. Other embodiments may be described and/or claimed.

5.

发明公开
METHODS AND APPARATUS TO LOAD DATA WITHIN A MACHINE LEARNING ACCELERATOR 审中-公开

公开(公告)号：US20240231839A1

公开(公告)日：2024-07-11

申请号：US18416303

申请日：2024-01-18

Applicant: Intel Corporation

Inventor： Arnab Raha , Deepak Mathaikutty , Debabrata Mohapatra , Sang Kyun Kim , Gautham Chinya , Cormac Brick

IPC: G06F9/445 , G06F9/30 , G06F9/50 , G06N20/00 , H03K19/177 , H03K19/20

CPC classification number: G06F9/445 , G06F9/3001 , G06F9/5027 , G06N20/00 , H03K19/177 , H03K19/20

Abstract: Methods, apparatus, systems, and articles of manufacture to load data into an accelerator are disclosed. An example apparatus includes data provider circuitry to load a first section and an additional amount of compressed machine learning parameter data into a processor engine. Processor engine circuitry executes a machine learning operation using the first section of compressed machine learning parameter data. A compressed local data re-user circuitry determines if a second section is present in the additional amount of compressed machine learning parameter data. The processor engine circuitry executes a machine learning operation using the second section when the second section is present in the additional amount of compressed machine learning parameter data.

6.

发明申请
MULTI-BUFFERED REGISTER FILES WITH SHARED ACCESS CIRCUITS 有权

公开(公告)号：US20210117197A1

公开(公告)日：2021-04-22

申请号：US17132895

申请日：2020-12-23

Applicant: Intel Corporation

Inventor： Steven Hsu , Amit Agarwal , Debabrata Mohapatra , Arnab Raha , Moongon Jung , Gautham Chinya , Ram Krishnamurthy

IPC: G06F9/30 , G06F13/16 , G06F15/78 , G06N3/04

Abstract: Systems, apparatuses and methods identify a plurality of registers that are associated with a system-on-chip. The plurality of registers includes a first portion dedicated to write operations and a second portion dedicated to read operations. The technology writes data to the first portion of the plurality of registers, and transfers the data from the first portion to the second portion.

7.

发明申请
MULTIPLICATION-FREE APPROXIMATION FOR NEURAL NETWORKS AND SPARSE CODING 审中-公开

公开(公告)号：US20190130148A1

公开(公告)日：2019-05-02

申请号：US16306736

申请日：2016-06-29

Applicant: Intel Corporation

Inventor： Gautham Chinya , Shihao Ji , Arnab Paul

IPC: G06K7/10 , G06F17/16 , G06N20/10 , G06N3/04 , G06K7/14

Abstract: Systems, apparatuses and methods may provide for replacing floating point matrix multiplication operations with an approximation algorithm or computation in applications that involve sparse codes and neural networks. The system may replace floating point matrix multiplication operations in sparse code applications and neural network applications with an approximation computation that applies an equivalent number of addition and/or subtraction operations.

8.

发明公开
Schedule-Aware Tensor Distribution Module 审中-公开

公开(公告)号：US20240220785A1

公开(公告)日：2024-07-04

申请号：US18408716

申请日：2024-01-10

Applicant: Intel Corporation

Inventor： Gautham Chinya , Huichu Liu , Arnab Raha , Debabrata Mohapatra , Cormac Brick , Lance Hacking

IPC: G06N3/063 , G06F9/38 , G06F9/448 , G06F9/50 , G06N5/04

CPC classification number: G06N3/063 , G06F9/3814 , G06F9/3877 , G06F9/4498 , G06F9/5027 , G06N5/04

Abstract: Methods and systems include a neural network system that includes a neural network accelerator comprising. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the deep neural network system. The neural network accelerator also includes a schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.

9.

发明授权
Schedule-aware tensor distribution module 有权

公开(公告)号：US11907827B2

公开(公告)日：2024-02-20

申请号：US16456707

申请日：2019-06-28

Applicant: Intel Corporation

Inventor： Gautham Chinya , Huichu Liu , Arnab Raha , Debabrata Mohapatra , Cormac Brick , Lance Hacking

IPC: G06N3/063 , G06N5/04 , G06F9/448 , G06F9/38 , G06F9/50

CPC classification number: G06N3/063 , G06F9/3814 , G06F9/3877 , G06F9/4498 , G06F9/5027 , G06N5/04

Abstract: Methods and systems include a neural network system that includes a neural network accelerator. The neural network accelerator includes multiple processing engines coupled together to perform arithmetic operations in support of an inference performed using the deep neural network system. The neural network accelerator also includes a schedule-aware tensor data distribution circuitry or software that is configured to load tensor data into the multiple processing engines in a load phase, extract output data from the multiple processing engines in an extraction phase, reorganize the extracted output data, and store the reorganized extracted output data to memory.

10.

发明授权
Methods, systems, articles of manufacture, and apparatus to decode zero-value-compression data vectors 有权

公开(公告)号：US11804851B2

公开(公告)日：2023-10-31

申请号：US16832804

申请日：2020-03-27

Applicant: Intel Corporation

Inventor： Gautham Chinya , Debabrata Mohapatra , Arnab Raha , Huichu Liu , Cormac Brick

IPC: H03M7/30 , G06F16/22 , G06N3/063 , G06N3/08 , G06N3/04

CPC classification number: H03M7/3082 , G06F16/2237 , G06N3/063 , G06N3/04 , G06N3/08

Abstract: Methods, systems, articles of manufacture, and apparatus are disclosed to decode zero-value-compression data vectors. An example apparatus includes: a buffer monitor to monitor a buffer for a header including a value indicative of compressed data; a data controller to, when the buffer includes compressed data, determine a first value of a sparse select signal based on (1) a select signal and (2) a first position in a sparsity bitmap, the first value of the sparse select signal corresponding to a processing element that is to process a portion of the compressed data; and a write controller to, when the buffer includes compressed data, determine a second value of a write enable signal based on (1) the select signal and (2) a second position in the sparsity bitmap, the second value of the write enable signal corresponding to the processing element that is to process the portion of the compressed data.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification