-
Publication No.: US10467183B2
Publication date: 2019-11-05
Application No.: US15640538
Application date: 2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming, Jr. , Simon C. Steely, Jr. , Kent D. Glossop
Abstract: Methods and apparatuses relating to pipelined runtime services in spatial arrays are described. In one embodiment, a processor includes processing elements; an interconnect network between the processing elements; a first configuration controller coupled to a first subset of the processing elements; and a second configuration controller coupled to a second, different subset of the processing elements, the first configuration controller and the second configuration controller are to configure the first subset and the second, different subset according to configuration information for a first context, and, for a context switch, the first configuration controller is to configure the first subset according to configuration information for a second context after pending operations of the first context are completed in the first subset and block second context dataflow into the second, different subset's input from the first subset's output until pending operations of the first context are completed in the second, different subset.
-
Publication No.: US10346144B2
Publication date: 2019-07-09
Application No.: US15721454
Application date: 2017-09-29
Applicant: Intel Corporation
Inventor: Yongzhi Zhang , Kent D. Glossop
Abstract: Methods, apparatus, systems and articles of manufacture to map a set of instructions onto a data flow graph are disclosed herein. An example apparatus includes a variable handler to modify a variable in the set of instructions. The variable is used multiple times in the set of instructions and the set of instructions are in a static single assignment form. The apparatus also includes a PHI handler to replace a PHI instruction contained in the set of instructions with a set of control data flow instructions and a data flow graph generator to map the set of instructions modified by the variable handler and the PHI handler onto a data flow graph without transforming the instructions out of the static single assignment form.
-
Publication No.: US20190005161A1
Publication date: 2019-01-03
Application No.: US15640535
Application date: 2017-07-01
Applicant: Intel Corporation
Inventor: Kermin Fleming , Kent D. Glossop , Simon C. Steely, JR. , Ping Tak Peter Tang
IPC: G06F17/50 , G06F15/78 , G06F12/0802
CPC classification number: G06F17/505 , G06F12/0802 , G06F12/0862 , G06F12/0888 , G06F12/0895 , G06F15/7867 , G06F15/8015 , G11C8/12 , H03K19/17736 , H03K19/17756 , H03K19/17764 , H03K19/17776 , H03K19/1778
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. At least one of the plurality of processing elements includes a plurality of control inputs.
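Illustrative note (not part of the patent record): the firing rule in this abstract, where an operator performs its operation once a complete incoming operand set has arrived, can be sketched in a few lines of Python. This is a minimal software model under stated assumptions, not the patented hardware: the class `DataflowOperator`, its per-port queues, and the `connect`/`receive` methods are hypothetical names introduced only for this sketch.

```python
from collections import deque


class DataflowOperator:
    """Hypothetical software model of one dataflow-graph node.

    Assumption: each input port has its own FIFO queue, and the node
    fires as soon as every port holds at least one value.
    """

    def __init__(self, fn, num_inputs):
        if num_inputs < 1:
            raise ValueError("an operator needs at least one input port")
        self.fn = fn
        self.inputs = [deque() for _ in range(num_inputs)]
        self.consumers = []  # (downstream operator, port index) pairs
        self.results = []    # every value this operator has produced

    def connect(self, consumer, port):
        """Route this operator's output to one input port of `consumer`."""
        self.consumers.append((consumer, port))

    def receive(self, port, value):
        """Deliver one operand, then fire while a full operand set is present."""
        self.inputs[port].append(value)
        while all(self.inputs):  # a deque is truthy only when non-empty
            operands = [queue.popleft() for queue in self.inputs]
            result = self.fn(*operands)
            self.results.append(result)
            for consumer, consumer_port in self.consumers:
                consumer.receive(consumer_port, result)


# Dataflow graph for (a + b) * c, with one node per operation.
add = DataflowOperator(lambda x, y: x + y, 2)
mul = DataflowOperator(lambda x, y: x * y, 2)
add.connect(mul, 0)

mul.receive(1, 4)  # c arrives first; mul cannot fire yet
add.receive(0, 2)  # a arrives; add is still waiting for b
add.receive(1, 3)  # b arrives; add fires, which in turn lets mul fire
print(mul.results)
```

In this model, arrival order does not matter: no node runs until its operand set is complete, so there is no program counter and data availability alone drives execution.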
-
Publication No.: US20190004955A1
Publication date: 2019-01-03
Application No.: US15640534
Application date: 2017-07-01
Applicant: Intel Corporation
Inventor: Michael C. Adler , Chiachen Chou , Neal C. Crago , Kermin Fleming , Kent D. Glossop , Aamer Jaleel , Pratik M. Marolia , Simon C. Steely, JR. , Samantika S. Sury
IPC: G06F12/0862 , G06F12/0802 , H03K19/177 , G06F15/78
CPC classification number: G06F12/0862 , G06F12/0802 , G06F15/7867 , G06F15/8015 , G06F17/505 , G06F2212/6026 , G11C8/12 , H03K19/17736 , H03K19/17756 , H03K19/1776 , H03K19/17764 , H03K19/17776 , H03K19/1778
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements is to perform an operation when an incoming operand set arrives at the plurality of processing elements. The processor also includes a streamer element to prefetch the incoming operand set from two or more levels of a memory system.
-
Publication No.: US20180188983A1
Publication date: 2018-07-05
Application No.: US15396049
Application date: 2016-12-30
Applicant: INTEL CORPORATION
Inventor: Kermin Elliott Fleming, JR. , Simon C. Steely, JR. , Kent D. Glossop
IPC: G06F3/06
Abstract: An integrated circuit includes a processor to execute instructions and to interact with memory, and acceleration hardware, to execute a sub-program corresponding to instructions. A set of input queues includes a store address queue to receive, from the acceleration hardware, a first address of the memory, the first address associated with a store operation and a store data queue to receive, from the acceleration hardware, first data to be stored at the first address of the memory. The set of input queues also includes a completion queue to buffer response data for a load operation. A disambiguator circuit, coupled to the set of input queues and the memory, is to, responsive to determining the load operation, which succeeds the store operation, has an address conflict with the first address, copy the first data from the store data queue into the completion queue for the load operation.
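Illustrative note (not part of the patent record): the forwarding behavior described in this abstract, where a later load whose address conflicts with a pending store receives the store's data from the store data queue via the completion queue, can be sketched as follows. This is a simplified software model under stated assumptions, not the claimed circuit: the class `Disambiguator`, its method names, the youngest-first search over pending stores, and the default value of 0 for unwritten addresses are all hypothetical choices made for this sketch.

```python
from collections import deque


class Disambiguator:
    """Hypothetical model of store-to-load forwarding between queues."""

    def __init__(self, memory):
        self.memory = memory         # dict: address -> value
        self.store_addr_q = deque()  # addresses of pending stores
        self.store_data_q = deque()  # data of pending stores, same order
        self.completion_q = deque()  # response data for load operations

    def enqueue_store(self, addr, data):
        """Record a store that has not yet been written to memory."""
        self.store_addr_q.append(addr)
        self.store_data_q.append(data)

    def load(self, addr):
        """Service a load that follows all currently pending stores."""
        # Assumption: search youngest-first so the latest store wins.
        for i in range(len(self.store_addr_q) - 1, -1, -1):
            if self.store_addr_q[i] == addr:
                # Address conflict: copy the store data to the completion queue.
                self.completion_q.append(self.store_data_q[i])
                return
        # No conflict: the load reads memory directly.
        self.completion_q.append(self.memory.get(addr, 0))

    def drain_store(self):
        """Commit the oldest pending store to memory."""
        addr = self.store_addr_q.popleft()
        data = self.store_data_q.popleft()
        self.memory[addr] = data


memory = {0x10: 1}
unit = Disambiguator(memory)
unit.enqueue_store(0x10, 99)    # store is pending; memory still holds 1
unit.load(0x10)                 # conflicts with the pending store
unit.load(0x20)                 # no conflict; reads memory
print(list(unit.completion_q))  # forwarded value first, then the memory value
```

The load to `0x10` sees the value 99 even though memory still holds the old value until `drain_store` runs, which is the ordering guarantee that forwarding is meant to preserve.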
-
Publication No.: US12135981B2
Publication date: 2024-11-05
Application No.: US18207870
Application date: 2023-06-09
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
Publication No.: US11416281B2
Publication date: 2022-08-16
Application No.: US16474978
Application date: 2016-12-31
Applicant: Intel Corporation
Inventor: Rajesh M. Sankaran , Gilbert Neiger , Narayan Ranganathan , Stephen R. Van Doren , Joseph Nuzman , Niall D. McDonnell , Michael A. O'Hanlon , Lokpraveen B. Mosur , Tracy Garrett Drysdale , Eriko Nurvitadhi , Asit K. Mishra , Ganesh Venkatesh , Deborah T. Marr , Nicholas P. Carter , Jonathan D. Pearce , Edward T. Grochowski , Richard J. Greco , Robert Valentine , Jesus Corbal , Thomas D. Fletcher , Dennis R. Bradford , Dwight P. Manley , Mark J. Charney , Jeffrey J. Cook , Paul Caprioli , Koichi Yamada , Kent D. Glossop , David B. Sheffield
Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.
-
Publication No.: US11200186B2
Publication date: 2021-12-14
Application No.: US16024854
Application date: 2018-06-30
Applicant: Intel Corporation
Inventor: Kermin E. Fleming, Jr. , Simon C. Steely, Jr. , Kent D. Glossop , Mitchell Diamond , Benjamin Keen , Dennis Bradford , Fabrizio Petrini , Barry Tannenbaum , Yongzhi Zhang
Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
-
Publication No.: US10515049B1
Publication date: 2019-12-24
Application No.: US15640541
Application date: 2017-07-01
Applicant: Intel Corporation
Inventor: Kermin E. Fleming , Simon C. Steely , Kent D. Glossop
Abstract: Methods and apparatuses relating to distributed memory hazard detection and error recovery are described. In one embodiment, a memory circuit includes a memory interface circuit to service memory requests from a spatial array of processing elements for data stored in a plurality of cache banks; and a hazard detection circuit in each of the plurality of cache banks, wherein a first hazard detection circuit for a speculative memory load request from the memory interface circuit, that is marked with a potential dynamic data dependency, to an address within a first cache bank of the first hazard detection circuit, is to mark the address for tracking of other memory requests to the address, store data from the address in speculative completion storage, and send the data from the speculative completion storage to the spatial array of processing elements when a memory dependency token is received for the speculative memory load request.
-
Publication No.: US10445250B2
Publication date: 2019-10-15
Application No.: US15859454
Application date: 2017-12-30
Applicant: Intel Corporation
Inventor: Kermin E. Fleming , Kent D. Glossop , Simon C. Steely
IPC: G06F12/1045
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a core with a decoder to decode an instruction into a decoded instruction and an execution unit to execute the decoded instruction to perform a first operation; a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform a second operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements.
-