Patent search ap:("INTEL CORPORATION") AND inv:"Simon C. Steely Page JR."

1.

Invention application
PROCESSORS, METHODS, AND SYSTEMS FOR A CONFIGURABLE SPATIAL ACCELERATOR WITH SECURITY, POWER REDUCTION, AND PERFORMANCE FEATURES Under examination (published)

Publication (announcement) number: US20190004878A1

Publication (announcement) date: 2019-01-03

Application number: US15640542

Application date: 2017-07-01

Applicant: Intel Corporation

Inventor: Michael C. Adler , Kermin Fleming , Kent D. Glossop , Simon C. Steely, JR.

IPC: G06F9/54 , G06F9/30 , G06F21/62 , G06F13/16

Abstract: Systems, methods, and apparatuses relating to a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements; and an interconnect network between the plurality of processing elements to receive an input of two dataflow graphs each comprising a plurality of nodes, wherein a first dataflow graph and a second dataflow graph are to be overlaid into a first and second portion, respectively, of the interconnect network and a first and second subset, respectively, of the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the first and second subsets of the plurality of processing elements are to perform a first and second operation, respectively, when incoming first and second, respectively, operand sets arrive at the plurality of processing elements.
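
The firing rule described in this abstract (a dataflow operator performs its operation once its incoming operand set has arrived) can be illustrated with a minimal Python sketch. The class `DataflowOperator`, the function `run`, and the two small graphs are hypothetical names chosen for illustration; they are not taken from the patent and do not model the actual hardware.

```python
from collections import deque


class DataflowOperator:
    """Hypothetical model of a processing element configured as one graph node."""

    def __init__(self, name, op, num_inputs):
        self.name = name
        self.op = op
        self.inputs = [deque() for _ in range(num_inputs)]  # one queue per operand
        self.outputs = []  # (consumer, port) pairs standing in for the interconnect

    def connect(self, consumer, port):
        self.outputs.append((consumer, port))

    def receive(self, port, value):
        self.inputs[port].append(value)

    def ready(self):
        # The operator may fire only when every operand queue holds a value.
        return all(len(q) > 0 for q in self.inputs)

    def fire(self):
        operands = [q.popleft() for q in self.inputs]
        result = self.op(*operands)
        for consumer, port in self.outputs:
            consumer.receive(port, result)
        return result


def run(operators):
    """Fire ready operators until none is ready; return the last result per name."""
    results = {}
    progress = True
    while progress:
        progress = False
        for pe in operators:
            if pe.ready():
                results[pe.name] = pe.fire()
                progress = True
    return results


# First graph, (a + b) * c, occupies one subset of processing elements.
add1 = DataflowOperator("g1.add", lambda x, y: x + y, 2)
mul1 = DataflowOperator("g1.mul", lambda x, y: x * y, 2)
add1.connect(mul1, 0)
# Second graph, d - e, occupies a disjoint subset.
sub2 = DataflowOperator("g2.sub", lambda x, y: x - y, 2)

add1.receive(0, 2)
add1.receive(1, 3)
mul1.receive(1, 4)
sub2.receive(0, 10)
sub2.receive(1, 7)
results = run([add1, mul1, sub2])
```

The two graphs share the same pool of operators but never exchange values, which is the sense in which they are overlaid onto separate subsets in this sketch.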

2.

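Invention application
SWITCHABLE TOPOLOGY MACHINE Under examination (published)

Publication (announcement) number: US20180113838A1

Publication (announcement) date: 2018-04-26

Application number: US15637581

Application date: 2017-06-29

Applicant: Intel Corporation

Inventor: William J. Butera , Simon C. Steely, JR. , Richard J. Dischler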

IPC: G06F15/173

CPC classification number: G06F15/17381 , G06F9/38 , G06F9/3897 , G06F15/17343

Abstract: Embodiments relate to a computational device including multiple processor tiles on a die that may have multiple switchable topologies. A topology of the computational device may include one or more virtual circuits. A virtual circuit may include multiple processor tiles. A processor tile of a virtual circuit of a topology may include a configuration vector to control a connection between the processor tile and a neighboring processor tile. A first topology of the computation device may correspond to a first phase of a computation of a program, and a second topology of the computation device may correspond to a second phase of the computation of the program. Other embodiments may be described and/or claimed.
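
As a rough illustration of per-tile configuration vectors selecting a topology, the Python sketch below assumes a 4-bit vector with one enable bit per neighbor direction and treats a link as active only when both tiles enable the facing ports. The bit layout, the mutual-enable rule, and the function `virtual_circuits` are assumptions made for this example, not details from the patent.

```python
# Hypothetical 4-bit configuration vector: one enable bit per neighbor direction.
N, E, S, W = 1, 2, 4, 8
STEP = {N: (-1, 0), E: (0, 1), S: (1, 0), W: (0, -1)}
OPPOSITE = {N: S, S: N, E: W, W: E}


def virtual_circuits(topology):
    """Group tiles into virtual circuits (sets of tiles joined by enabled links).

    `topology` maps a (row, col) tile to its configuration vector.
    """
    seen = set()
    circuits = []
    for start in sorted(topology):
        if start in seen:
            continue
        circuit, stack = set(), [start]
        while stack:
            tile = stack.pop()
            if tile in circuit:
                continue
            circuit.add(tile)
            for bit, (dr, dc) in STEP.items():
                nbr = (tile[0] + dr, tile[1] + dc)
                # Assumed rule: both tiles must enable the ports that face each other.
                if (topology[tile] & bit) and nbr in topology \
                        and (topology[nbr] & OPPOSITE[bit]):
                    stack.append(nbr)
        seen |= circuit
        circuits.append(circuit)
    return circuits


# 2x2 die. Phase 1 of a program: two independent row circuits.
phase1 = {(0, 0): E, (0, 1): W, (1, 0): E, (1, 1): W}
# Phase 2: rewrite the vectors so all four tiles form a single ring.
phase2 = {(0, 0): E | S, (0, 1): W | S, (1, 0): E | N, (1, 1): W | N}

circuits_phase1 = virtual_circuits(phase1)
circuits_phase2 = virtual_circuits(phase2)
```

Switching from `phase1` to `phase2` changes only the vectors stored in the tiles, which is how one physical grid can serve two phases of a computation in this model.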

3.

Invention application
INTERRUPTIBLE AND RESTARTABLE MATRIX MULTIPLICATION INSTRUCTIONS, PROCESSORS, METHODS, AND SYSTEMS Under examination (published)

Publication (announcement) number: US20180004510A1

Publication (announcement) date: 2018-01-04

Application number: US15201442

Application date: 2016-07-02

Applicant: Intel Corporation

Inventor: Edward T. Grochowski , Asit K. Mishra , Robert Valentine , Mark J. Charney , Simon C. Steely, JR.

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/3001 , G06F9/30036 , G06F9/30145 , G06F9/3861 , G06F9/3865

Abstract: A processor of an aspect includes a decode unit to decode a matrix multiplication instruction. The matrix multiplication instruction is to indicate a first memory location of a first source matrix, is to indicate a second memory location of a second source matrix, and is to indicate a third memory location where a result matrix is to be stored. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the matrix multiplication instruction, is to multiply a portion of the first and second source matrices prior to an interruption, and store a completion progress indicator in response to the interruption. The completion progress indicator is to indicate an amount of progress in multiplying the first and second source matrices, and storing corresponding result data to the third memory location, that is to have been completed prior to the interruption.
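
The interrupt-and-resume behavior can be sketched in software, assuming the completion progress indicator is simply the number of result rows already stored. The function `matmul_restartable` and its `interrupt_after` parameter are hypothetical and exist only to simulate an interruption; the patent does not specify this granularity.

```python
def matmul_restartable(a, b, c, progress=0, interrupt_after=None):
    """Multiply a by b into c row by row, resuming from `progress`.

    `progress` is a hypothetical completion progress indicator: the number of
    result rows already stored in c. The return value is the updated indicator
    and equals len(a) once the multiplication has finished.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    done_this_call = 0
    for i in range(progress, rows):
        if interrupt_after is not None and done_this_call == interrupt_after:
            return i  # simulated interruption: report how far we got
        c[i] = [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        done_this_call += 1
    return rows


a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8], [9, 10]]
c = [[0, 0] for _ in a]

# First attempt is interrupted after one result row has been stored.
saved = matmul_restartable(a, b, c, interrupt_after=1)
partial = [row[:] for row in c]

# Restart from the saved indicator; completed rows are not recomputed.
final = matmul_restartable(a, b, c, progress=saved)
```

Because the indicator records only completed and stored work, restarting never repeats or skips a row in this model.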

4.

Invention application
PROCESSORS, METHODS, SYSTEMS, AND INSTRUCTIONS TO LOAD MULTIPLE DATA ELEMENTS TO DESTINATION STORAGE LOCATIONS OTHER THAN PACKED DATA REGISTERS Under examination (published)

Publication (announcement) number: US20190384601A1

Publication (announcement) date: 2019-12-19

Application number: US16537318

Application date: 2019-08-09

Applicant: Intel Corporation

Inventor: William C. Hasenplaugh , Chris J. Newburn , Simon C. Steely, JR. , Samantika S. Sury

IPC: G06F9/30 , G06F12/0886 , G06F12/0897 , G06F12/126 , G06F12/1045

Abstract: A processor of an aspect includes a plurality of packed data registers, and a decode unit to decode an instruction. The instruction is to indicate a packed data register of the plurality of packed data registers that is to store a source packed memory address information. The source packed memory address information is to include a plurality of memory address information data elements. An execution unit is coupled with the decode unit and the plurality of packed data registers. The execution unit, in response to the instruction, is to load a plurality of data elements from a plurality of memory addresses that are each to correspond to a different one of the plurality of memory address information data elements, and store the plurality of loaded data elements in a destination storage location. The destination storage location does not include a register of the plurality of packed data registers.
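
A gather whose destination is not a packed data register can be pictured with the short Python sketch below. The flat `memory` list, the `dest` buffer (standing in for some non-register destination such as a memory region), and the function `gather_to_storage` are illustrative assumptions rather than the instruction's defined semantics.

```python
def gather_to_storage(memory, packed_addresses, dest, dest_offset=0):
    """Load one element per address and store them contiguously in `dest`.

    `packed_addresses` plays the role of the source packed memory address
    information; `dest` is a hypothetical non-register destination buffer.
    """
    loaded = [memory[addr] for addr in packed_addresses]
    dest[dest_offset:dest_offset + len(loaded)] = loaded
    return dest


memory = list(range(100, 116))      # memory[i] == 100 + i
packed_addresses = [3, 0, 7, 12]    # four scattered element addresses
dest = [0] * 8                      # destination storage, not a register
gather_to_storage(memory, packed_addresses, dest)
```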

5.

Invention application
PROCESSORS, METHODS, AND SYSTEMS FOR DEBUGGING A CONFIGURABLE SPATIAL ACCELERATOR Under examination (published)

Publication (announcement) number: US20190095383A1

Publication (announcement) date: 2019-03-28

Application number: US15719281

Application date: 2017-09-28

Applicant: Intel Corporation

Inventor: Kermin Fleming , Simon C. Steely, JR. , Kent D. Glossop

IPC: G06F15/80

Abstract: Systems, methods, and apparatuses relating to debugging a configurable spatial accelerator are described. In one embodiment, a processor includes a plurality of processing elements and an interconnect network between the plurality of processing elements to receive an input of a dataflow graph comprising a plurality of nodes, wherein the dataflow graph is to be overlaid into the interconnect network and the plurality of processing elements with each node represented as a dataflow operator in the plurality of processing elements, and the plurality of processing elements are to perform an operation by a respective, incoming operand set arriving at each of the dataflow operators of the plurality of processing elements. At least a first of the plurality of processing elements is to enter a halted state in response to being represented as a first of the plurality of dataflow operators.

6.

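Invention application
MEMORY ORDERING IN ACCELERATION HARDWARE Under examination (published)

Publication (announcement) number: US20180188997A1

Publication (announcement) date: 2018-07-05

Application number: US15396038

Application date: 2016-12-30

Applicant: INTEL CORPORATION

Inventor: Kermin Elliott Fleming, JR. , Simon C. Steely, JR. , Kent D. Glossop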

IPC: G06F3/06

Abstract: An integrated circuit includes a memory interface, coupled to a memory to store data corresponding to instructions, and an operations queue to buffer memory operations corresponding to the instructions. The integrated circuit may include acceleration hardware to execute a sub-program corresponding to the instructions. A set of input queues may include an address queue to receive, from the acceleration hardware, an address of the memory associated with a second memory operation of the memory operations, and a dependency queue to receive, from the acceleration hardware, a dependency token associated with the address. The dependency token indicates a dependency on data generated by a first memory operation of the memory operations. A scheduler circuit may schedule issuance of the second memory operation to the memory in response to the dependency queue receiving the dependency token and the address queue receiving the address.
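
The scheduling condition in this abstract (issue the second memory operation only after both its address and its dependency token have arrived) can be sketched as follows. `MemoryOpChannel`, `schedule`, and the dictionary used as memory are hypothetical stand-ins chosen for this example, and the sketch models only a dependent load.

```python
from collections import deque


class MemoryOpChannel:
    """Hypothetical input queues feeding one memory operation."""

    def __init__(self, name):
        self.name = name
        self.address_queue = deque()     # addresses from the acceleration hardware
        self.dependency_queue = deque()  # tokens signalling an earlier op completed

    def can_issue(self):
        return bool(self.address_queue) and bool(self.dependency_queue)


def schedule(channels, memory, issued):
    """Issue each load whose address and dependency token are both present."""
    for ch in channels:
        while ch.can_issue():
            addr = ch.address_queue.popleft()
            ch.dependency_queue.popleft()  # consume the matching token
            issued.append((ch.name, addr, memory[addr]))


memory = {4: 0}
load = MemoryOpChannel("load")
issued = []

# The load's address arrives first, but the store it depends on has not run.
load.address_queue.append(4)
schedule([load], memory, issued)
issued_before_token = list(issued)

# The first operation (a store) completes and emits the dependency token.
memory[4] = 99
load.dependency_queue.append("token")
schedule([load], memory, issued)
```

Holding the load until the token arrives is what guarantees it observes the stored value rather than stale data.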

7.

Invention application
METHOD, APPARATUS, AND SYSTEM FOR CACHE COHERENCY USING A COARSE DIRECTORY Under examination (published)

Publication (announcement) number: US20170351430A1

Publication (announcement) date: 2017-12-07

Application number: US15170050

Application date: 2016-06-01

Applicant: Intel Corporation

Inventor: Robert G. Blankenship , Simon C. Steely, JR. , Samantika S. Sury

IPC: G06F3/06 , G06F12/0808 , G06F12/0815 , G06F12/0811 , G06F12/0842

CPC classification number: G06F3/0605 , G06F3/0625 , G06F3/0659 , G06F3/0673 , G06F12/0808 , G06F12/0811 , G06F12/0815 , G06F12/0824 , G06F12/0826 , G06F12/0831 , G06F12/0842 , G06F2212/1028 , G06F2212/1048 , Y02D10/13

Abstract: Systems, methods, and apparatuses are directed to requesting access to a memory address; storing an identification of the memory address in a data structure; receiving a first request for access to the memory address, the request comprising a reference to a second processor core; storing the reference to the second processor in the data structure; receiving a second request for access to the memory address, the second request comprising a reference to a third processor core; determining, based on the data structure, that the third processor core is different from the second processor core; and responding to the second request without buffering the second request.

8.

Invention publication
APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR LOADING A TILE OF A MATRIX OPERATIONS ACCELERATOR Under examination (published)

Publication (announcement) number: US20240220323A1

Publication (announcement) date: 2024-07-04

Application number: US18149045

Application date: 2022-12-30

Applicant: Intel Corporation

Inventor: Gregory Henry , Kermin E. Chofleming , Simon C. Steely, JR.

IPC: G06F9/50 , G06F5/01 , G06F7/487

CPC classification number: G06F9/5027 , G06F5/012 , G06F7/4876

Abstract: Systems, methods, and apparatuses relating to floating-point support circuitry to implement floating-point operations on a two-dimensional grid of fixed-point processing elements are described. In one example, a hardware processor includes a two-dimensional grid of fixed-point processing elements; floating-point support circuitry coupled to the two-dimensional grid of fixed-point processing elements; storage for a first, a second, and a destination two-dimensional floating-point matrices coupled to the floating-point support circuitry; and controller circuitry to cause the two-dimensional grid of fixed-point processing elements and the floating-point support circuitry to: determine, by the floating-point support circuitry, an extreme exponent for each row of the first two-dimensional floating-point matrix and for each column of the second two-dimensional floating-point matrix, generate, by the floating-point support circuitry, a first fixed-point matrix from the first two-dimensional floating-point matrix and a second fixed-point matrix from the second two-dimensional floating-point matrix, generate, by the two-dimensional grid of fixed-point processing elements, corresponding fixed-point results by a multiplication of fixed-point elements of the first fixed-point matrix by corresponding fixed-point elements of the second fixed-point matrix, scale, by the floating-point support circuitry, the corresponding fixed-point results according to the extreme exponents to generate scaled fixed-point results, generate, by the floating-point support circuitry, a resultant floating-point matrix from the scaled fixed-point results, and store the resultant floating-point matrix into the destination two-dimensional floating-point matrix.
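
The sequence of steps in this abstract (find an extreme exponent per row and per column, convert to fixed point, multiply in fixed point, then rescale) can be sketched in Python. This is an illustration under stated assumptions: the extreme exponent is taken to be the maximum binary exponent, `FRAC_BITS` is an arbitrary fixed-point width, and the function names are hypothetical.

```python
import math

FRAC_BITS = 20  # assumed fixed-point fraction width


def max_exponent(values):
    """Largest binary exponent in `values` (assumed meaning of 'extreme exponent')."""
    return max((math.frexp(v)[1] for v in values if v != 0.0), default=0)


def to_fixed(values, exp):
    """Scale values by 2**(FRAC_BITS - exp) and round to integers."""
    return [round(math.ldexp(v, FRAC_BITS - exp)) for v in values]


def block_fp_matmul(a, b):
    """Multiply float matrices using only integer multiply-accumulate in the core."""
    cols_b = list(zip(*b))
    row_exp = [max_exponent(r) for r in a]          # one exponent per row of a
    col_exp = [max_exponent(c) for c in cols_b]     # one exponent per column of b
    a_fx = [to_fixed(r, e) for r, e in zip(a, row_exp)]
    b_fx = [to_fixed(c, e) for c, e in zip(cols_b, col_exp)]  # held column-wise
    result = []
    for i, row in enumerate(a_fx):
        out = []
        for j, col in enumerate(b_fx):
            acc = sum(x * y for x, y in zip(row, col))  # fixed-point grid's job
            # Undo both scalings to recover a floating-point result.
            out.append(math.ldexp(acc, row_exp[i] + col_exp[j] - 2 * FRAC_BITS))
        result.append(out)
    return result


a = [[1.5, -2.0], [0.25, 4.0]]
b = [[2.0, 0.5], [1.0, -3.0]]
product = block_fp_matmul(a, b)
```

Sharing one exponent per row and per column means small elements in a row with a large element lose precision, which is the usual trade-off of this kind of block scaling.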

9.

Invention application
LAYERED SUPER-RETICLE COMPUTING: ARCHITECTURES AND METHODS Granted

Publication (announcement) number: US20210255674A1

Publication (announcement) date: 2021-08-19

Application number: US17174106

Application date: 2021-02-11

Applicant: Intel Corporation

Inventor: Simon C. Steely, JR. , Richard Dischler , David Bach , Olivier Franza , William J. Butera , Christian Karl , Benjamin Keen , Brian Leung

IPC: G06F1/18 , H01L23/538 , G06F15/76 , H01L25/065 , G06F9/50

Abstract: Embodiments herein may present an integrated circuit or a computing system having an integrated circuit, where the integrated circuit includes a physical network layer, a physical computing layer, and a physical memory layer, each having a set of dies, and a die including multiple tiles. The physical network layer further includes one or more signal pathways dynamically configurable between multiple pre-defined interconnect topologies for the multiple tiles, where each topology of the multiple pre-defined interconnect topologies corresponds to a communication pattern related to a workload. At least a tile in the physical computing layer is further arranged to move data to another tile in the physical computing layer or a storage cell of the physical memory layer through the one or more signal pathways in the physical network layer. Other embodiments may be described and/or claimed.

10.

Invention application
LAYERED SUPER-RETICLE COMPUTING: ARCHITECTURES AND METHODS Under examination (published)

Publication (announcement) number: US20190354146A1

Publication (announcement) date: 2019-11-21

Application number: US16416753

Application date: 2019-05-20

Applicant: Intel Corporation

Inventor: Simon C. Steely, JR. , Richard Dischler , David Bach , Olivier Franza , William J. Butera , Christian Karl , Benjamin Keen , Brian Leung

IPC: G06F1/18 , H01L23/538 , H01L25/065 , G06F9/50 , G06F15/76

Abstract: Embodiments herein may present an integrated circuit or a computing system having an integrated circuit, where the integrated circuit includes a physical network layer, a physical computing layer, and a physical memory layer, each having a set of dies, and a die including multiple tiles. The physical network layer further includes one or more signal pathways dynamically configurable between multiple pre-defined interconnect topologies for the multiple tiles, where each topology of the multiple pre-defined interconnect topologies corresponds to a communication pattern related to a workload. At least a tile in the physical computing layer is further arranged to move data to another tile in the physical computing layer or a storage cell of the physical memory layer through the one or more signal pathways in the physical network layer. Other embodiments may be described and/or claimed.
