-
公开(公告)号:US10445451B2
公开(公告)日:2019-10-15
申请号:US15640535
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr. , Ping Tak Peter Tang
IPC: G06F17/50 , G06F12/0802 , H03K19/177 , G06F15/78 , G06F15/80 , G11C8/12
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. At least one of the plurality of processing elements includes a plurality of control inputs.
-
公开(公告)号:US10445098B2
公开(公告)日:2019-10-15
申请号:US15721809
申请日:2017-09-30
Applicant: INTEL CORPORATION
Inventor: Kermin E. Fleming , Simon C. Steely , Kent D. Glossop
Abstract: Methods and apparatuses relating to privileged configuration in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; and a configuration controller coupled to a first subset and a second, different subset of the plurality of processing elements, the first subset having an output coupled to an input of the second, different subset, wherein the configuration controller is to configure the interconnect network between the first subset and the second, different subset of the plurality of processing elements to not allow communication on the interconnect network between the first subset and the second, different subset when a privilege bit is set to a first value and to allow communication on the interconnect network between the first subset and the second, different subset of the plurality of processing elements when the privilege bit is set to a second value.
-
13.
公开(公告)号:US20190205284A1
公开(公告)日:2019-07-04
申请号:US15859466
申请日:2017-12-30
Applicant: Intel Corporation
Inventor: Kermin E. Fleming , Simon C. Steely, Jr. , Kent D. Glossop
IPC: G06F15/173 , G06F9/54
CPC classification number: G06F15/17331 , G06F9/542
Abstract: Methods and apparatuses relating to consistency in an accelerator are described. In one embodiment, request address file (RAF) circuits are coupled to a spatial array by a first network, a memory is coupled to the RAF circuits by a second network, a RAF circuit is to not issue, into the second network, a request to the memory marked with a program order dependency on a previous request until receiving a first token generated by completion of the previous request to the memory by another RAF circuit, and a second RAF circuit is to not issue, into the second network, a second request to the memory marked with a program order dependency on a first request until receiving a second token sent by a first RAF circuit when a predetermined time period has lapsed since the first request was issued by the first RAF circuit into the second network.
-
14.
公开(公告)号:US20190095369A1
公开(公告)日:2019-03-28
申请号:US15719285
申请日:2017-09-28
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, JR.
IPC: G06F13/28 , G06F12/0813 , G06F12/0811
Abstract: Systems, methods, and apparatuses relating to a memory fence mechanism in a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a plurality of operations, each by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. The processor also includes a fence manager to manage a memory fence between a first operation and a second operation of the plurality of operations.
-
15.
公开(公告)号:US20190004945A1
公开(公告)日:2019-01-03
申请号:US15640533
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, JR. , Samantika S. Sury
IPC: G06F12/0802 , G06F17/50 , H03K19/177
CPC classification number: G06F12/0802 , G06F12/0804 , G06F12/0811 , G06F12/0815 , G06F15/7867 , G06F15/8015 , G06F15/825 , G06F17/505 , G11C7/1012 , G11C8/12 , G11C2207/2245 , H03K19/17736 , H03K19/17756 , H03K19/1776 , H03K19/17764 , H03K19/17776 , H03K19/1778
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In an embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an atomic operation when an incoming operand set arrives at the plurality of processing elements.
-
公开(公告)号:US11693691B2
公开(公告)日:2023-07-04
申请号:US17381521
申请日:2021-07-21
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
CPC classification number: G06F9/48 , G06F9/3001 , G06F9/3004 , G06F9/30036 , G06F9/383
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
公开(公告)号:US11093277B2
公开(公告)日:2021-08-17
申请号:US16913265
申请日:2020-06-26
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
公开(公告)号:US11086816B2
公开(公告)日:2021-08-10
申请号:US15719281
申请日:2017-09-28
Applicant: Intel Corporation
Inventor: Kermin Fleming , Simon C. Steely, Jr. , Kent D. Glossop
IPC: G06F15/80
Abstract: Systems, methods, and apparatuses relating to debugging a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. At least a first of the plurality of processing elements is to enter a halted state in response to being represented as a first of the plurality of dataflow operators.
-
19.
公开(公告)号:US10817291B2
公开(公告)日:2020-10-27
申请号:US16370915
申请日:2019-03-30
Applicant: Intel Corporation
Inventor: Jesus Corbal , Rohan Sharma , Simon Steely, Jr. , Chinmay Ashok , Kent D. Glossop , Dennis Bradford , Paul Caprioli , Louise Huot , Kermin ChoFleming , Barry Tannenbaum
Abstract: Systems, methods, and apparatuses relating to swizzle operations and disable operations in a configurable spatial accelerator (CSA) are described. Certain embodiments herein provide for an encoding system for a specific set of swizzle primitives across a plurality of packed data elements in a CSA. In one embodiment, a CSA includes a plurality of processing elements, a circuit switched interconnect network between the plurality of processing elements, and a configuration register within each processing element to store a configuration value having a first portion that, when set to a first value that indicates a first mode, causes the processing element to pass an input value to operation circuitry of the processing element without modifying the input value, and, when set to a second value that indicates a second mode, causes the processing element to perform a swizzle operation on the input value to form a swizzled input value before sending the swizzled input value to the operation circuitry of the processing element, and a second portion that causes the processing element to perform an operation indicated by the second portion the configuration value on the input value in the first mode and the swizzled input value in the second mode with the operation circuitry.
-
公开(公告)号:US10515046B2
公开(公告)日:2019-12-24
申请号:US15640543
申请日:2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, Jr.
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a synchronizer circuit coupled between an interconnect network of a first tile and an interconnect network of a second tile and comprising storage to store data to be sent between the interconnect network of the first tile and the interconnect network of the second tile, the synchronizer circuit to convert the data from the storage between a first voltage or a first frequency of the first tile and a second voltage or a second frequency of the second tile to generate converted data, and send the converted data between the interconnect network of the first tile and the interconnect network of the second tile
-
-
-
-
-
-
-
-
-