-
1.
公开(公告)号:US20240220323A1
公开(公告)日:2024-07-04
申请号:US18149045
申请日:2022-12-30
Applicant: Intel Corporation
Inventor: Gregory Henry , Kermin E. Chofleming , Simon C. Steely, JR.
CPC classification number: G06F9/5027 , G06F5/012 , G06F7/4876
Abstract: Systems, methods, and apparatuses relating to floating-point support circuitry to implement floating-point operations on a two-dimensional grid of fixed-point processing elements are described. In one example, a hardware processor includes a two-dimensional grid of fixed-point processing elements; floating-point support circuitry coupled to the two-dimensional grid of fixed-point processing elements; storage for a first, a second, and a destination two-dimensional floating-point matrices coupled to the floating-point support circuitry; and controller circuitry to cause the two-dimensional grid of fixed-point processing elements and the floating-point support circuitry to: determine, by the floating-point support circuitry, an extreme exponent for each row of the first two-dimensional floating-point matrix and for each column of the second two-dimensional floating-point matrix, generate, by the floating-point support circuitry, a first fixed-point matrix from the first two-dimensional floating-point matrix and a second fixed-point matrix from the second two-dimensional floating-point matrix, generate, by the two-dimensional grid of fixed-point processing elements, corresponding fixed-point results by a multiplication of fixed-point elements of the first fixed-point matrix by corresponding fixed-point elements of the second fixed-point matrix, scale, by the floating-point support circuitry, the corresponding fixed-point results according to the extreme exponents to generate scaled fixed-point results, generate, by the floating-point support circuitry, a resultant floating-point matrix from the scaled fixed-point results, and store the resultant floating-point matrix into the destination two-dimensional floating-point matrix.
-
公开(公告)号:US11907713B2
公开(公告)日:2024-02-20
申请号:US16729369
申请日:2019-12-28
Applicant: Intel Corporation
Inventor: Kermin E. Chofleming , Chuanjun Zhang , Daniel Towner , Simon C. Steely, Jr. , Benjamin Keen
CPC classification number: G06F9/3001 , G06F9/30181 , G06F15/80
Abstract: Systems, methods, and apparatuses relating to a sign modification field for fused operations in a configurable spatial accelerator are described. In one embodiment, a hardware accelerator includes a plurality of processing elements; a network between the plurality of processing elements to transfer values between the plurality of processing elements; and a processing element of the plurality of processing elements comprising: a first plurality of input queues having a multiple bit width coupled to the network, at least one first output queue having the multiple bit width coupled to the network, operation circuitry coupled to the first plurality of input queues having the multiple bit width, a sign modification circuit coupled to the first plurality of input queues having the multiple bit width, and a configuration register within the processing element to store a configuration value comprising a sign modification field that causes the sign modification circuit to modify a sign bit of a value from the first plurality of input queues according to the sign modification field to create a sign modified value, and the configuration value causes the operation circuitry to perform a selected operation of a plurality of operations on a value from the first plurality of input queues and the sign modified value to create a resultant value, and store the resultant value in the at least one first output queue.
-
公开(公告)号:US11029958B1
公开(公告)日:2021-06-08
申请号:US16729372
申请日:2019-12-28
Applicant: Intel Corporation
Inventor: Chuanjun Zhang , Kermin E. Chofleming
IPC: G06F9/30
Abstract: Systems, methods, and apparatuses relating to configurable operand size operation circuitry in an operation configurable spatial accelerator are described. In one embodiment, a hardware accelerator includes a plurality of processing elements, a network between the plurality of processing elements to transfer values between the plurality of processing elements, and a first processing element of the plurality of processing elements including a first plurality of input queues having a multiple bit width coupled to the network, at least one first output queue having the multiple bit width coupled to the network, configurable operand size operation circuitry coupled to the first plurality of input queues, and a configuration register within the first processing element to store a configuration value that causes the configurable operand size operation circuitry to switch to a first mode for a first multiple bit width from a plurality of selectable multiple bit widths of the configurable operand size operation circuitry, perform a selected operation on a plurality of first multiple bit width values from the first plurality of input queues in series to create a resultant value, and store the resultant value in the at least one first output queue.
-
-