-
Publication No.: US20190004878A1
Publication date: 2019-01-03
Application No.: US15640542
Filing date: 2017-07-01
Applicant: Intel Corporation
Inventor: Michael C. Adler , Kermin Fleming , Kent D. Glossop , Simon C. Steely, JR.
Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of two dataflow graphs each comprising a plurality of nodes, wherein a first dataflow graph and a second dataflow graph are to be overlaid into a first and second portion, respectively, of the interconnect network and a first and second subset, respectively, of the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the first and second subsets of the plurality of processing elements are to perform a first and second operation, respectively, when incoming first and second, respectively, operand sets arrive at the plurality of processing elements.
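The firing rule in this abstract (an operator performs its operation once its operand set has arrived) can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `DataflowOperator`, the function `run`, and the two toy graphs are hypothetical names invented for illustration, and the interconnect is modeled as plain Python queues rather than anything described in the application.

```python
from collections import deque


class DataflowOperator:
    """One dataflow-graph node, imagined as mapped onto one processing element."""

    def __init__(self, name, fn, num_inputs):
        self.name = name
        self.fn = fn
        self.inputs = [deque() for _ in range(num_inputs)]  # one queue per operand
        self.consumers = []  # (operator, input port) pairs fed by this node

    def connect(self, consumer, port):
        self.consumers.append((consumer, port))

    def ready(self):
        # The operator may fire only when every operand queue holds a value.
        return all(self.inputs)

    def fire(self):
        operands = [q.popleft() for q in self.inputs]
        result = self.fn(*operands)
        for consumer, port in self.consumers:
            consumer.inputs[port].append(result)
        return result


def run(operators):
    """Fire ready operators until none is ready; return the firing log."""
    log = []
    progress = True
    while progress:
        progress = False
        for op in operators:
            if op.ready():
                log.append((op.name, op.fire()))
                progress = True
    return log


# Graph 1, (a + b) * 2, occupies one subset of the elements.
add = DataflowOperator("g1.add", lambda a, b: a + b, 2)
dbl = DataflowOperator("g1.dbl", lambda x: x * 2, 1)
add.connect(dbl, 0)
# Graph 2, a - b, occupies a disjoint subset of the same fabric.
sub = DataflowOperator("g2.sub", lambda a, b: a - b, 2)

fabric = [add, dbl, sub]
add.inputs[0].append(3)          # graph 1 has only one of its two operands
sub.inputs[0].append(10)
sub.inputs[1].append(4)
first = run(fabric)              # only graph 2 can fire
add.inputs[1].append(5)          # graph 1's operand set is now complete
second = run(fabric)
```

In this model the two graphs share one fabric but never exchange values, so each progresses independently as its own operands arrive.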
-
Publication No.: US20180113838A1
Publication date: 2018-04-26
Application No.: US15637581
Filing date: 2017-06-29
Applicant: Intel Corporation
Inventor: William J. Butera , Simon C. Steely, JR. , Richard J. Dischler
IPC: G06F15/173
CPC classification number: G06F15/17381 , G06F9/38 , G06F9/3897 , G06F15/17343
Abstract: Embodiments relate to a computational device including multiple processor tiles on a die that may have multiple switchable topologies. A topology of the computational device may include one or more virtual circuits. A virtual circuit may include multiple processor tiles. A processor tile of a virtual circuit of a topology may include a configuration vector to control a connection between the processor tile and a neighboring processor tile. A first topology of the computation device may correspond to a first phase of a computation of a program, and a second topology of the computation device may correspond to a second phase of the computation of the program. Other embodiments may be described and/or claimed.
-
Publication No.: US20180004510A1
Publication date: 2018-01-04
Application No.: US15201442
Filing date: 2016-07-02
Applicant: Intel Corporation
Inventor: Edward T. Grochowski , Asit K. Mishra , Robert Valentine , Mark J. Charney , Simon C. Steely, JR.
CPC classification number: G06F9/3001 , G06F9/30036 , G06F9/30145 , G06F9/3861 , G06F9/3865
Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator is to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
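A software analogy may help show what a completion progress indicator buys: the multiplication can stop partway and later resume without redoing finished work. The sketch below is an assumption-laden illustration, not the claimed mechanism. The function name `matmul_resumable`, the `budget` parameter that simulates an interruption, and the choice to count progress in completed result elements are all hypothetical.

```python
def matmul_resumable(a, b, c, progress=0, budget=None):
    """Multiply a (n x k) by b (k x m) into c, starting at result element `progress`.

    `budget` caps how many result elements are computed before a simulated
    interruption. The return value is the updated progress indicator: the
    number of result elements already stored in c.
    """
    n, k, m = len(a), len(b), len(b[0])
    total = n * m
    done = 0
    while progress < total:
        if budget is not None and done >= budget:
            break  # simulated interruption: keep the indicator and stop
        i, j = divmod(progress, m)  # row-major position of the next result element
        c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
        progress += 1
        done += 1
    return progress


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[0, 0], [0, 0]]

saved = matmul_resumable(a, b, c, budget=3)   # interrupted after 3 elements
partial = [row[:] for row in c]               # snapshot of the partial result
saved = matmul_resumable(a, b, c, progress=saved)  # resume from the indicator
```

After the first call `saved` is 3 and one element of `c` is still zero; the second call computes only the remaining element.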
-
Publication No.: US20190384601A1
Publication date: 2019-12-19
Application No.: US16537318
Filing date: 2019-08-09
Applicant: Intel Corporation
Inventor: William C. Hasenplaugh , Chris J. Newburn , Simon C. Steely, JR. , Samantika S. Sury
IPC: G06F9/30 , G06F12/0886 , G06F12/0897 , G06F12/126 , G06F12/1045
Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate a packed data register of the plurality of packed data registers that is to store a source packed memory address information. The source packed memory address information is to include a plurality of memory address information data elements. An execution unit is coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the instruction, is to load a plurality of data elements from a plurality of memory addresses that are each to correspond to a different one of the plurality of memory address information data elements, and store the plurality of loaded data elements in a destination storage location. The destination storage location does not include a register of the plurality of packed data registers.
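The operation described here resembles a gather whose destination is not a packed data register. As a rough sketch only, the function below models memory as a Python dict and assumes the destination is a contiguous memory region; the name `gather_to_memory` and the parameter `dest_base` are hypothetical and do not come from the application.

```python
def gather_to_memory(memory, packed_addresses, dest_base):
    """Load one element per packed address, then store them contiguously.

    `memory` maps address -> value. The destination is a memory region
    starting at `dest_base` (an assumption), not a packed register.
    """
    loaded = [memory[addr] for addr in packed_addresses]  # all loads happen first
    for offset, value in enumerate(loaded):
        memory[dest_base + offset] = value
    return loaded


memory = {100: "a", 205: "b", 37: "c", 990: "d"}
gathered = gather_to_memory(memory, [990, 37, 100], dest_base=0)
```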
-
Publication No.: US20190095383A1
Publication date: 2019-03-28
Application No.: US15719281
Filing date: 2017-09-28
Applicant: Intel Corporation
Inventor: Kermin Fleming , Simon C. Steely, JR. , Kent D. Glossop
IPC: G06F15/80
Abstract: Systems, methods, and apparatuses relating to debugging a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. At least a first of the plurality of processing elements is to enter a halted state in response to being represented as a first of the plurality of dataflow operators.
-
Publication No.: US20180188997A1
Publication date: 2018-07-05
Application No.: US15396038
Filing date: 2016-12-30
Applicant: Intel Corporation
Inventor: Kermin Elliott Fleming, JR. , Simon C. Steely, JR. , Kent D. Glossop
IPC: G06F3/06
Abstract: An integrated circuit includes a memory interface, coupled to a memory to store data corresponding to instructions, and an operations queue to buffer memory operations corresponding to the instructions. The integrated circuit may include acceleration hardware to execute a sub-program corresponding to the instructions. A set of input queues may include an address queue to receive, from the acceleration hardware, an address of the memory associated with a second memory operation of the memory operations, and a dependency queue to receive, from the acceleration hardware, a dependency token associated with the address. The dependency token indicates a dependency on data generated by a first memory operation of the memory operations. A scheduler circuit may schedule issuance of the second memory operation to the memory in response to the dependency queue receiving the dependency token and the address queue receiving the address.
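The scheduling condition in this abstract (issue the second memory operation only once both its address and its dependency token have arrived) can be modeled in a few lines. This is a minimal sketch under assumptions: the class `MemoryOpScheduler`, its method names, and the token string are hypothetical, and issuing is reduced to appending to a list.

```python
from collections import deque


class MemoryOpScheduler:
    """Toy model: an operation issues only when address and token are both present."""

    def __init__(self):
        self.address_queue = deque()     # addresses from the acceleration hardware
        self.dependency_queue = deque()  # tokens signalling an earlier op's data is ready
        self.issued = []                 # (address, token) pairs sent to memory

    def push_address(self, addr):
        self.address_queue.append(addr)
        self._try_issue()

    def push_dependency_token(self, token):
        self.dependency_queue.append(token)
        self._try_issue()

    def _try_issue(self):
        # Pair the oldest address with the oldest token while both queues are non-empty.
        while self.address_queue and self.dependency_queue:
            addr = self.address_queue.popleft()
            token = self.dependency_queue.popleft()
            self.issued.append((addr, token))


sched = MemoryOpScheduler()
sched.push_address(0x40)                 # address alone is not enough
before = list(sched.issued)
sched.push_dependency_token("store-1-done")
after = list(sched.issued)
```

The same outcome results if the token arrives before the address, since either arrival triggers the check.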
-
Publication No.: US20170351430A1
Publication date: 2017-12-07
Application No.: US15170050
Filing date: 2016-06-01
Applicant: Intel Corporation
Inventor: Robert G. Blankenship , Simon C. Steely, JR. , Samantika S. Sury
IPC: G06F3/06 , G06F12/0808 , G06F12/0815 , G06F12/0811 , G06F12/0842
CPC classification number: G06F3/0605 , G06F3/0625 , G06F3/0659 , G06F3/0673 , G06F12/0808 , G06F12/0811 , G06F12/0815 , G06F12/0824 , G06F12/0826 , G06F12/0831 , G06F12/0842 , G06F2212/1028 , G06F2212/1048 , Y02D10/13
Abstract: Systems, methods, and apparatuses are directed to requesting access to a memory address; storing an identification of the memory address in a data structure; receiving a first request for access to the memory address, the request comprising a reference to a second processor core; storing the reference to the second processor in the data structure; receiving a second request for access to the memory address, the second request comprising a reference to a third processor core; determining, based on the data structure, that the third processor core is different from the second processor core; and responding to the second request without buffering the second request.
-
Publication No.: US20240220323A1
Publication date: 2024-07-04
Application No.: US18149045
Filing date: 2022-12-30
Applicant: Intel Corporation
Inventor: Gregory Henry , Kermin E. Chofleming , Simon C. Steely, JR.
CPC classification number: G06F9/5027 , G06F5/012 , G06F7/4876
Abstract: Systems, methods, and apparatuses relating to floating-point support circuitry to implement floating-point operations on a two-dimensional grid of fixed-point processing elements are described. In one example, a hardware processor includes a two-dimensional grid of fixed-point processing elements; floating-point support circuitry coupled to the two-dimensional grid of fixed-point processing elements; storage for a first, a second, and a destination two-dimensional floating-point matrices coupled to the floating-point support circuitry; and controller circuitry to cause the two-dimensional grid of fixed-point processing elements and the floating-point support circuitry to: determine, by the floating-point support circuitry, an extreme exponent for each row of the first two-dimensional floating-point matrix and for each column of the second two-dimensional floating-point matrix, generate, by the floating-point support circuitry, a first fixed-point matrix from the first two-dimensional floating-point matrix and a second fixed-point matrix from the second two-dimensional floating-point matrix, generate, by the two-dimensional grid of fixed-point processing elements, corresponding fixed-point results by a multiplication of fixed-point elements of the first fixed-point matrix by corresponding fixed-point elements of the second fixed-point matrix, scale, by the floating-point support circuitry, the corresponding fixed-point results according to the extreme exponents to generate scaled fixed-point results, generate, by the floating-point support circuitry, a resultant floating-point matrix from the scaled fixed-point results, and store the resultant floating-point matrix into the destination two-dimensional floating-point matrix.
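The sequence of steps in this abstract (find an extreme exponent per row and per column, convert to fixed point, multiply in fixed point, scale, convert back) can be sketched in software. The code below is an illustration under assumptions, not the claimed circuitry: it takes "extreme exponent" to mean the maximum exponent, uses an arbitrary `FRAC_BITS` of 20, and the names `max_exponent`, `to_fixed`, and `block_fp_matmul` are hypothetical.

```python
import math

FRAC_BITS = 20  # assumed fixed-point precision; not taken from the application


def max_exponent(values):
    # math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1.
    return max(math.frexp(v)[1] for v in values)


def to_fixed(values, exp):
    # Scale by 2**(FRAC_BITS - exp) so every magnitude is at most 2**FRAC_BITS.
    return [round(math.ldexp(v, FRAC_BITS - exp)) for v in values]


def block_fp_matmul(a, b):
    n, k, m = len(a), len(b), len(b[0])
    cols_b = [[b[p][j] for p in range(k)] for j in range(m)]
    # Step 1: one shared exponent per row of a and per column of b.
    row_exp = [max_exponent(row) for row in a]
    col_exp = [max_exponent(col) for col in cols_b]
    # Step 2: convert both operands to integers (fixed point).
    a_fix = [to_fixed(row, e) for row, e in zip(a, row_exp)]
    b_fix = [to_fixed(col, e) for col, e in zip(cols_b, col_exp)]
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Step 3: integer-only multiply-accumulate.
            acc = sum(x * y for x, y in zip(a_fix[i], b_fix[j]))
            # Steps 4-5: undo both scalings and return to floating point.
            c[i][j] = math.ldexp(acc, row_exp[i] + col_exp[j] - 2 * FRAC_BITS)
    return c


a = [[1.5, -2.0], [0.25, 4.0]]
b = [[2.0, 0.5], [1.0, -3.0]]
result = block_fp_matmul(a, b)
```

Sharing one exponent across a whole row or column is what lets the inner loop run on integers alone; the cost is that elements much smaller than the row or column maximum lose low-order bits when rounded.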
-
Publication No.: US20210255674A1
Publication date: 2021-08-19
Application No.: US17174106
Filing date: 2021-02-11
Applicant: Intel Corporation
Inventor: Simon C. Steely, JR. , Richard Dischler , David Bach , Olivier Franza , William J. Butera , Christian Karl , Benjamin Keen , Brian Leung
IPC: G06F1/18 , H01L23/538 , G06F15/76 , H01L25/065 , G06F9/50
Abstract: Embodiments herein may present an integrated circuit or a computing system having an integrated circuit, where the integrated circuit includes a physical network layer, a physical computing layer, and a physical memory layer, each having a set of dies, and a die including multiple tiles. The physical network layer further includes one or more signal pathways dynamically configurable between multiple pre-defined interconnect topologies for the multiple tiles, where each topology of the multiple pre-defined interconnect topologies corresponds to a communication pattern related to a workload. At least a tile in the physical computing layer is further arranged to move data to another tile in the physical computing layer or a storage cell of the physical memory layer through the one or more signal pathways in the physical network layer. Other embodiments may be described and/or claimed.
-
Publication No.: US20190354146A1
Publication date: 2019-11-21
Application No.: US16416753
Filing date: 2019-05-20
Applicant: Intel Corporation
Inventor: Simon C. Steely, JR. , Richard Dischler , David Bach , Olivier Franza , William J. Butera , Christian Karl , Benjamin Keen , Brian Leung
IPC: G06F1/18 , H01L23/538 , H01L25/065 , G06F9/50 , G06F15/76
Abstract: Embodiments herein may present an integrated circuit or a computing system having an integrated circuit, where the integrated circuit includes a physical network layer, a physical computing layer, and a physical memory layer, each having a set of dies, and a die including multiple tiles. The physical network layer further includes one or more signal pathways dynamically configurable between multiple pre-defined interconnect topologies for the multiple tiles, where each topology of the multiple pre-defined interconnect topologies corresponds to a communication pattern related to a workload. At least a tile in the physical computing layer is further arranged to move data to another tile in the physical computing layer or a storage cell of the physical memory layer through the one or more signal pathways in the physical network layer. Other embodiments may be described and/or claimed.