Patent search ap:("Intel Corporation") AND inv:"Ben J. Ashbaugh" Page 2

11.

发明授权
Compute optimizations for low precision machine learning operations 有权

公开(公告)号：US10853906B2

公开(公告)日：2020-12-01

申请号：US16197821

申请日：2018-11-21

Applicant: Intel Corporation

Inventor： Elmoustapha Ould-Ahmed-Vall , Sara S. Baghsorkhi , Anbang Yao , Kevin Nealis , Xiaoming Chen , Altug Koker , Abhishek R. Appu , John C. Weast , Mike B. Macpherson , Dukhwan Kim , Linda L. Hurd , Ben J. Ashbaugh , Barath Lakshmanan , Liwei Ma , Joydeep Ray , Ping T. Tang , Michael S. Strickland

IPC: G06T1/20 , G06F7/483 , G06N3/08 , G06F9/30 , G06N3/04 , G06N3/063 , G06F9/50 , G06F9/38 , G06N20/00 , G06F3/14 , G06T1/60 , G06T15/00

Abstract: One embodiment provides an accelerator module comprising a memory stack including multiple memory dies; a graphics processing unit (GPU) coupled with the memory stack via one or more memory controllers, the GPU including a plurality of multiprocessors having a single instruction, multiple thread (SIMT) architecture, the multiprocessors to execute at least one single instruction. The at least one single instruction is to cause at least a portion of the GPU to perform a floating point operation on input having differing precisions. The floating point operation is a two-dimensional matrix multiply and accumulate operation.

12.

发明申请
SYSTEM AND METHOD TO SUPPORT MULTIPLE WALKERS PER COMMAND 审中-公开

公开(公告)号：US20200286201A1

公开(公告)日：2020-09-10

申请号：US16297129

申请日：2019-03-08

Applicant: Intel Corporation

Inventor： James Valerio , Vasanth Ranganathan , Joydeep Ray , Abhishek R. Appu , Ben J. Ashbaugh , Brandon Fliflet , Jeffery S. Boles , Srinivasan Embar Raghukrishnan , Rahul Kulkarni

IPC: G06T1/20 , G06T1/60

Abstract: Embodiments described herein provide an apparatus comprising a processor to configure a plurality of contexts of a command engine to execute a graphics workload comprising a plurality of walkers, allocate, from a pool of execution units of a graphics processor, a subset of execution units to each walker in the plurality of walkers based at least in part on the predetermined number of walkers configured for the context, for each context in the plurality of contexts, dispatch one or more walkers of the plurality of walkers to the execution units, and upon dispatch of the one or more walkers of the plurality of walkers, write an opcode to a computer-readable memory indicating that the dispatch of the walker is complete, wherein the opcode comprises dependency data for the one or more walkers of the plurality of walkers. Other embodiments may be described and claimed.

13.

发明授权
Compute optimization mechanism for deep neural networks 有权

公开(公告)号：US10417734B2

公开(公告)日：2019-09-17

申请号：US15698217

申请日：2017-09-07

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06T1/60 , G09G5/36 , G06F3/06 , G06N3/08 , G06F3/14 , G06N3/04 , G06N3/063 , G09G5/00

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a memory device including a first integrated circuit (IC) including a plurality of memory channels and a second IC including a plurality of processing units, each coupled to a memory channel in the plurality of memory channels.

14.

发明授权
Facilitating execution-aware hybrid preemption for execution of tasks in computing environments 有权

公开(公告)号：US10338953B2

公开(公告)日：2019-07-02

申请号：US15074325

申请日：2016-03-18

Applicant: INTEL CORPORATION

Inventor： Ben J. Ashbaugh , Raun M. Krisch

IPC: G06F9/46 , G06F13/32 , G06F13/24 , G06F9/38 , G06F9/48 , G06T1/20

Abstract: A mechanism is described for facilitating execution-aware hybrid preemption for execution of tasks in computing environments. A method of embodiments, as described herein, includes detecting a software application being hosted by a computing device, where the software applications to facilitate one or more tasks that are capable of being executed by a graphics processor of the computing device. The method may further include selecting at least one of a fine grain preemption or a coarse grain preemption based on comparison of a first time estimation and a second time estimation relating to the one or more tasks at thread level execution and work group level execution, respectively. The method may further include initiating performance of the selected one of the fine grain preemption and the coarse grain preemption.

15.

发明授权
Extend GPU/CPU coherency to multi-GPU cores 有权

公开(公告)号：US10261903B2

公开(公告)日：2019-04-16

申请号：US15489149

申请日：2017-04-17

Applicant: Intel Corporation

Inventor： Chandrasekaran Sakthivel , Prasoonkumar Surti , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Abhishek R. Appu , Nicolas C. Galoppo Von Borries , Joydeep Ray , Narayan Srinivasa , Feng Chen , Ben J. Ashbaugh , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Eriko Nurvitadhi , Balaji Vembu , Altug Koker

IPC: G06F12/0837 , G06N3/08 , G06N20/00 , G06T1/20

Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.

16.

发明申请
CONVOLUTIONAL NEURAL NETWORK OPTIMIZATION MECHANISM 审中-公开

公开(公告)号：US20180300600A1

公开(公告)日：2018-10-18

申请号：US15488551

申请日：2017-04-17

Applicant: Intel Corporation

Inventor： Liwei Ma , Elmoustapha Ould- Ahmed-Vall , Barath Lakshmanan , Ben J. Ashbaugh , Jingyi Jin , Jeremy Bottleson , Mike B. Macpherson , Kevin Nealis , Dhawal Srivastava , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman , Altug Koker , Abhishek R. Appu

IPC: G06N3/04 , G06N3/08 , G06N3/063

Abstract: An apparatus to facilitate optimization of a convolutional neural network (CNN) is disclosed. The apparatus includes optimization logic to receive a CNN model having a list of instructions and including pruning logic to optimize the list of instructions by eliminating branches in the list of instructions that comprise a weight value of 0.

17.

发明授权
Providing native support for generic pointers in a graphics processing unit 有权

公开(公告)号：US12299766B2

公开(公告)日：2025-05-13

申请号：US17484066

申请日：2021-09-24

Applicant: Intel Corporation

Inventor： Joydeep Ray , Prathamesh Raghunath Shinde , Ben J. Ashbaugh , Wei-Yu Chen , Abhishek R. Appu , Vasanth Ranganathan , Dmitry Yurievich Babokin , Ankur N. Shah

IPC: G06T1/20 , G06F9/30 , G06F9/38 , G06T1/60

Abstract: Systems and methods for supporting generic pointers in hardware of a graphics processing unit (GPU) are provided. In various examples, a GPU includes multiple sub-cores each having a processing resource and a load/store pipeline. The processing resource is operable to receive a memory access message including a pointer and a memory type identifier indicative of the pointer representing a generic pointer. The processing resource is further operable to output a load or store operation to the load/store pipeline based on the memory access message, including computing an address for the load or store operation by adding a base address of a named memory type of a plurality of named memory types referenced by the generic pointer to an offset into a memory of the named memory type. The load/store pipeline is operable to, responsive to receipt of the load or store operation, access the memory at the address.

18.

发明公开
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS 审中-公开

公开(公告)号：US20240257294A1

公开(公告)日：2024-08-01

申请号：US18436494

申请日：2024-02-08

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F8/41 , G06F9/455 , G06F9/50 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084

CPC classification number: G06T1/20 , G06F9/45533 , G06F9/5061 , G06F9/5094 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06F8/41 , G06F2009/45583

Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

19.

发明公开
INCREASING PROCESSING RESOURCES IN PROCESSING CORES OF A GRAPHICS ENVIRONMENT 审中-公开

公开(公告)号：US20240160478A1

公开(公告)日：2024-05-16

申请号：US17987185

申请日：2022-11-15

Applicant: Intel Corporation

Inventor： Jiasheng Chen , Chunhui Mei , Ben J. Ashbaugh , Naveen Matam , Joydeep Ray , Timothy Bauer , Guei-Yuan Lueh , Vasanth Ranganathan , Prashant Chaudhari , Vikranth Vemulapalli , Nishanth Reddy Pendluru , Piotr Reiter , Jain Philip , Marek Rudniewski , Christopher Spencer , Parth Damani , Prathamesh Raghunath Shinde , John Wiegert , Fataneh Ghodrat

IPC: G06F9/50 , G06F12/0875

CPC classification number: G06F9/5016 , G06F12/0875 , G06F2212/452

Abstract: An apparatus to facilitate increasing processing resources in processing cores of a graphics environment is disclosed. The apparatus includes a plurality of processing resources to execute one or more execution threads; a plurality of message arbiter-processing resource (MA-PR) routers, wherein a respective MA-PR router of the plurality of MA-PR routers corresponds to a pair of processing resources of the plurality of processing resources and is to arbitrate routing of a thread control message from a message arbiter between the pair of processing resources; a plurality of local shared cache (LSC) sequencers to provide an interface between at least one LSC of the processing core and the plurality of processing resources; and a plurality of instruction caches (ICs) to store instructions of the one or more execution threads, wherein a respective IC of the plurality of ICs interfaces with a portion of the plurality of processing resources.

20.

发明授权
Compute optimization mechanism for deep neural networks 有权

公开(公告)号：US11922535B2

公开(公告)日：2024-03-05

申请号：US18168207

申请日：2023-02-13

Applicant: Intel Corporation

Inventor： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC: G06T1/20 , G06F9/455 , G06F9/50 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06F8/41

CPC classification number: G06T1/20 , G06F9/45533 , G06F9/5061 , G06F9/5094 , G06N3/044 , G06N3/045 , G06N3/063 , G06N3/084 , G06F8/41 , G06F2009/45583

Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification