Patent search ap:("Intel Corporation") AND inv:"Balaji Vembu" Page 2

11.

Granted patent
Machine learning sparse computation mechanism

Legal status: Active

Publication number: US11803935B2

Publication date: 2023-10-31

Application number: US17881720

Filing date: 2022-08-05

Applicant: Intel Corporation

Inventor: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries

IPC: G06F17/16, G06T1/20, G06F9/30, G06F9/38, G06F12/0811, G06F12/0815, G06F12/0831, G06F12/0888, H03M7/30, G06N20/00, G06F12/02, G06F18/2136, G06F9/48, G06N3/04, G06N3/08, G06T1/60, G06T15/00

CPC classification number: G06T1/20, G06F9/3001, G06F9/3885, G06F9/4881, G06F12/0207, G06F12/0811, G06F12/0815, G06F12/0831, G06F12/0888, G06F17/16, G06F18/2136, G06N3/04, G06N3/08, G06N20/00, G06T1/60, G06T15/005, H03M7/30, G06F2212/1024, G06F2212/302, G06F2212/401, G06F2212/621, G06T2200/28

Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.

12.

Granted patent
Dynamic distributed training of machine learning models

Legal status: Active

Publication number: US11797837B2

Publication date: 2023-10-24

Application number: US15494971

Filing date: 2017-04-24

Applicant: Intel Corporation

Inventor: Altug Koker, Abhishek R. Appu, Kamal Sinha, Joydeep Ray, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, John C. Weast, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Farshad Akhbari, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06N3/08, G06N20/00, G06N3/063, G06N3/044, G06N3/045, G06N3/048

CPC classification number: G06N3/08, G06N3/044, G06N3/045, G06N3/063, G06N20/00, G06N3/048

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

13.

Published application
DYNAMIC DISTRIBUTED TRAINING OF MACHINE LEARNING MODELS

Legal status: Pending (published)

Publication number: US20230334316A1

Publication date: 2023-10-19

Application number: US18314450

Filing date: 2023-05-09

Applicant: Intel Corporation

Inventor: Altug Koker, Abhishek R. Appu, Kamal Sinha, Joydeep Ray, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, John C. Weast, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Farshad Akhbari, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06N3/08, G06N20/00, G06N3/063, G06N3/044, G06N3/045

CPC classification number: G06N3/08, G06N20/00, G06N3/063, G06N3/044, G06N3/045, G06N3/048

Abstract: Described herein is a graphics processor comprising a memory device and a graphics processing cluster coupled with the memory device. The graphics processing cluster includes a plurality of graphics multiprocessors interconnected via a data interconnect. A graphics multiprocessor includes circuitry configured to load a modular neural network including a plurality of subnetworks, each of the plurality of subnetworks trained to perform a computer vision operation on a separate subject.

14.

Published application
SCALABLE I/O VIRTUALIZATION INTERRUPT AND SCHEDULING

Legal status: Pending (published)

Publication number: US20230297526A1

Publication date: 2023-09-21

Application number: US17832305

Filing date: 2022-06-03

Applicant: Intel Corporation

Inventor: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala

IPC: G06F13/24, G06F13/16

CPC classification number: G06F13/24, G06F13/1668, G06T1/20

Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.

15.

Granted patent
Scalable I/O virtualization interrupt and scheduling

Legal status: Active

Publication number: US11748283B1

Publication date: 2023-09-05

Application number: US17832305

Filing date: 2022-06-03

Applicant: Intel Corporation

Inventor: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala

IPC: G06F9/48, G06F12/084, G06F13/24, G06F13/16, G06T1/20

CPC classification number: G06F13/24, G06F13/1668, G06T1/20

Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.

16.

Granted patent
Data operations and finite state machine for machine learning via bypass of computational tasks based on frequently-used data values

Legal status: Active

Publication number: US11748106B2

Publication date: 2023-09-05

Application number: US17683564

Filing date: 2022-03-01

Applicant: Intel Corporation

Inventor: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Abhishek R. Appu, Altug Koker, Kamal Sinha, Joydeep Ray, Balaji Vembu, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06F9/38

CPC classification number: G06F9/3832

Abstract: A mechanism is described for facilitating fast data operations and for facilitating a finite state machine for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting input data to be used in computational tasks by a computation component of a processor including a graphics processor. The method may further include determining one or more frequently-used data values (FDVs) from the data, and pushing the one or more frequent data values to bypass the computational tasks.

17.

Published application
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS

Legal status: Pending (published)

Publication number: US20230260072A1

Publication date: 2023-08-17

Application number: US18168207

Filing date: 2023-02-13

Applicant: Intel Corporation

Inventor: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu

IPC: G06T1/20, G06F9/455, G06F9/50, G06N3/063, G06N3/084, G06N3/044, G06N3/045

CPC classification number: G06T1/20, G06F9/45533, G06F9/5061, G06F9/5094, G06N3/063, G06N3/084, G06N3/044, G06N3/045, G06F8/41

Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

18.

Granted patent
Instructions and logic to perform floating point and integer operations for machine learning

Legal status: Active

Publication number: US11720355B2

Publication date: 2023-08-08

Application number: US17834482

Filing date: 2022-06-07

Applicant: Intel Corporation

Inventor: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06F9/30, G09G5/393, G06F9/38, G06F7/483, G06F7/544, G06N3/063, G06N3/08, G06N3/044, G06N3/045, G06T15/00, G06N20/00, G06F17/16

CPC classification number: G06F9/3001, G06F7/483, G06F7/5443, G06F9/30014, G06F9/30036, G06F9/3851, G06N3/044, G06N3/045, G06N3/063, G06N3/08, G09G5/393, G06F9/3013, G06F9/30025, G06F17/16, G06F2207/3824, G06N20/00, G06T15/005

Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.

19.

Published application
Regional Adjustment of Render Rate

Legal status: Pending (published)

Publication number: US20230142472A1

Publication date: 2023-05-11

Application number: US17959374

Filing date: 2022-10-04

Applicant: Intel Corporation

Inventor: Eric J. Asperheim, Subramaniam Maiyuran, Kiran C. Veernapu, Sanjeev S. Jahagirdar, Balaji Vembu, Devan Burke, Philip R. Laws, Kamal Sinha, Abhishek R. Appu, Elmoustapha Ould-Ahmed-Vall, Peter L. Doyle, Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Altug Koker

IPC: G06F3/14, G06F3/01, G09G5/391, G06F3/0484

CPC classification number: G06F3/1438, G06F3/013, G09G5/391, G06F3/0484, G09G2354/00, G09G2352/00, G09G2360/08, G09G2340/0435, G09G2360/121, G09G5/001

Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past and/or what it is predicted that the user will look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.

20.

Patent application
INSTRUCTIONS AND LOGIC TO PERFORM FLOATING POINT AND INTEGER OPERATIONS FOR MACHINE LEARNING

Legal status: Active

Publication number: US20230046506A1

Publication date: 2023-02-16

Application number: US17967283

Filing date: 2022-10-17

Applicant: Intel Corporation

Inventor: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

IPC: G06F9/30, G06F7/483, G06N3/063, G06N3/04, G06F9/38, G06N3/08, G09G5/393, G06F7/544, G06T15/00, G06N20/00, G06F17/16

Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.
