-
公开(公告)号:US20230260072A1
公开(公告)日:2023-08-17
申请号:US18168207
申请日:2023-02-13
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
CPC classification number: G06T1/20 , G06F9/45533 , G06F9/5061 , G06F9/5094 , G06N3/063 , G06N3/084 , G06N3/044 , G06N3/045 , G06F8/41
Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
-
公开(公告)号:US20230014565A1
公开(公告)日:2023-01-19
申请号:US17849201
申请日:2022-06-24
Applicant: Intel Corporation
Inventor: Joydeep Ray , Altug Koker , Varghese George , Mike Macpherson , Aravindh Anantaraman , Abhishek R. Appu , Elmoustapha Ould-Ahmed-Vall , Nicolas Galoppo von Borries , Ben J. Ashbaugh
IPC: G06F12/0888 , G06T1/60 , G06T1/20
Abstract: Embodiments described herein provide techniques to facilitate instruction-based control of memory attributes. One embodiment provides a graphics processor comprising a processing resource, a memory device, a cache coupled with the processing resources and the memory, and circuitry to process a memory access message received from the processing resource. The memory access message enables access to data of the memory device. To process the memory access message, the circuitry is configured to determine one or more cache attributes that indicate whether the data should be read from or stored the cache. The cache attributes may be provided by the memory access message or stored in state data associated with the data to be accessed by the access message.
-
公开(公告)号:US20220382555A1
公开(公告)日:2022-12-01
申请号:US17839856
申请日:2022-06-14
Applicant: Intel Corporation
Inventor: ELMOUSTAPHA OULD-AHMED-VALL , BARATH LAKSHMANAN , TATIANA SHPEISMAN , Joydeep Ray , Ping T. Tang , Michael Strickland , Xiaoming Chen , Anbang Yao , Ben J. Ashbaugh , Linda L. Hurd , Liwei Ma
IPC: G06F9/38 , G06F9/30 , G06F13/42 , G06F13/40 , G06N20/00 , G06T1/20 , G06N3/04 , G06N3/063 , G06N3/08 , G06N20/10 , G06F9/50 , G06F15/80 , G06N3/00
Abstract: One embodiment provides for a graphics processing unit (GPU) to accelerate machine learning operations, the GPU comprising an instruction cache to store a first instruction and a second instruction, the first instruction to cause the GPU to perform a floating-point operation, including a multi-dimensional floating-point operation, and the second instruction to cause the GPU to perform an integer operation; and a general-purpose graphics compute unit having a single instruction, multiple thread architecture, the general-purpose graphics compute unit to concurrently execute the first instruction and the second instruction.
-
公开(公告)号:US20220335562A1
公开(公告)日:2022-10-20
申请号:US17741934
申请日:2022-05-11
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.
-
公开(公告)号:US20220156876A1
公开(公告)日:2022-05-19
申请号:US17529862
申请日:2021-11-18
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes one or more processing units to provide a first set of shader operations associated with a shader stage of a graphics pipeline, a scheduler to schedule shader threads for processing, and a field-programmable gate array (FPGA) dynamically configured to provide a second set of shader operations associated with the shader stage of the graphics pipeline.
-
公开(公告)号:US11334962B2
公开(公告)日:2022-05-17
申请号:US17385693
申请日:2021-07-26
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of processing cores of a first type and a second type. A first set of processing cores of a first type perform multi-dimensional matrix operations and a second set of processing cores of a second type perform general purpose graphics processing unit (GPGPU) operations.
-
公开(公告)号:US20210350499A1
公开(公告)日:2021-11-11
申请号:US17385693
申请日:2021-07-26
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of processing cores of a first type and a second type. A first set of processing cores of a first type perform multi-dimensional matrix operations and a second set of processing cores of a second type perform general purpose graphics processing unit (GPGPU) operations.
-
公开(公告)号:US20210279571A1
公开(公告)日:2021-09-09
申请号:US17177632
申请日:2021-02-17
Applicant: Intel Corporation
Inventor: Narayan Srinivasa , Joydeep Ray , Nicolas C. Galoppo Von Borries , Ben J. Ashbaugh , Prasoonkumar Surti , Feng Chen , Barath Lakshmanan , Elmoustapha Ould-Ahmed-Vall , Liwei Ma , Linda L. Hurd , Abhishek R. Appu , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Chandrasekaran Sakthivel , Farshad Akhbari , Dukhwan Kim , Altug Koker , Nadathur Rajagopalan Satish
Abstract: An apparatus to facilitate optimization of a neural network (NN) is disclosed. The apparatus includes optimization logic to define a NN topology having one or more macro layers, adjust the one or more macro layers to adapt to input and output components of the NN and train the NN based on the one or more macro layers.
-
公开(公告)号:US20210241417A1
公开(公告)日:2021-08-05
申请号:US17145885
申请日:2021-01-11
Applicant: Intel Corporation
Inventor: Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu
Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type.
-
公开(公告)号:US20200210338A1
公开(公告)日:2020-07-02
申请号:US16727127
申请日:2019-12-26
Applicant: Intel Corporation
Inventor: Chandrasekaran Sakthivel , Prasoonkumar Surti , John C. Weast , Sara S. Baghsorkhi , Justin E. Gottschlich , Abhishek R. Appu , Nicolas C. Galoppo Von Borries , Joydeep Ray , Narayan Srinivasa , Feng Chen , Ben J. Ashbaugh , Rajkishore Barik , Tsung-Han Lin , Kamal Sinha , Eriko Nurvitadhi , Balaji Vembu , Altug Koker
IPC: G06F12/0837 , G06N3/08 , G06N20/00 , G06T1/20 , G06F12/0815 , G06N3/063 , G06N3/04
Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.
-
-
-
-
-
-
-
-
-