APPARATUSES, METHODS, AND SYSTEMS FOR INSTRUCTIONS FOR LOADING A TILE OF A MATRIX OPERATIONS ACCELERATOR

    Publication number: US20240220323A1

    Publication date: 2024-07-04

    Application number: US18149045

    Filing date: 2022-12-30

    CPC classification number: G06F9/5027 G06F5/012 G06F7/4876

    Abstract: Systems, methods, and apparatuses relating to floating-point support circuitry to implement floating-point operations on a two-dimensional grid of fixed-point processing elements are described. In one example, a hardware processor includes a two-dimensional grid of fixed-point processing elements; floating-point support circuitry coupled to the two-dimensional grid of fixed-point processing elements; storage for a first, a second, and a destination two-dimensional floating-point matrices coupled to the floating-point support circuitry; and controller circuitry to cause the two-dimensional grid of fixed-point processing elements and the floating-point support circuitry to: determine, by the floating-point support circuitry, an extreme exponent for each row of the first two-dimensional floating-point matrix and for each column of the second two-dimensional floating-point matrix, generate, by the floating-point support circuitry, a first fixed-point matrix from the first two-dimensional floating-point matrix and a second fixed-point matrix from the second two-dimensional floating-point matrix, generate, by the two-dimensional grid of fixed-point processing elements, corresponding fixed-point results by a multiplication of fixed-point elements of the first fixed-point matrix by corresponding fixed-point elements of the second fixed-point matrix, scale, by the floating-point support circuitry, the corresponding fixed-point results according to the extreme exponents to generate scaled fixed-point results, generate, by the floating-point support circuitry, a resultant floating-point matrix from the scaled fixed-point results, and store the resultant floating-point matrix into the destination two-dimensional floating-point matrix.
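The data flow in this abstract resembles a block floating-point matrix multiply. The following is a minimal software sketch of that flow, not the patented circuitry. It assumes that the "extreme exponent" is the largest binary exponent in a row or column, that the per-element products are accumulated into a dot product, and that a 24-bit fixed-point width is used. The names `FRAC_BITS`, `max_exponent`, `to_fixed`, and `block_fp_matmul` are hypothetical and do not appear in the source. The sketch also folds the scaling step and the conversion back to floating point into a single `ldexp` call.

```python
import math

FRAC_BITS = 24  # assumed fixed-point width; not specified in the abstract


def max_exponent(values):
    """Return the largest binary exponent among the nonzero values.

    Assumption: the abstract's "extreme exponent" is taken to be the maximum.
    """
    exps = [math.frexp(v)[1] for v in values if v != 0.0]
    return max(exps) if exps else 0


def to_fixed(values, shared_exp):
    """Quantize floats to integers that all share one exponent."""
    return [int(round(math.ldexp(v, FRAC_BITS - shared_exp))) for v in values]


def block_fp_matmul(a, b):
    """Multiply matrices a (rows x inner) and b (inner x cols) via integers."""
    rows, inner, cols = len(a), len(b), len(b[0])
    b_cols = [[b[k][j] for k in range(inner)] for j in range(cols)]

    # One shared exponent per row of a and per column of b.
    row_exp = [max_exponent(r) for r in a]
    col_exp = [max_exponent(c) for c in b_cols]

    # Fixed-point copies of both operands.
    a_fix = [to_fixed(r, e) for r, e in zip(a, row_exp)]
    b_fix = [to_fixed(c, e) for c, e in zip(b_cols, col_exp)]

    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Integer-only multiply-accumulate, standing in for the grid.
            acc = sum(x * y for x, y in zip(a_fix[i], b_fix[j]))
            # Undo both shared-exponent shifts and return to floating point.
            out[i][j] = math.ldexp(acc, row_exp[i] + col_exp[j] - 2 * FRAC_BITS)
    return out
```

Sharing one exponent per row and per column lets the inner loop run on plain integers. The cost is that elements much smaller than the largest element in their row or column lose low-order bits during quantization.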

    Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator

    Publication number: US11907713B2

    Publication date: 2024-02-20

    Application number: US16729369

    Filing date: 2019-12-28

    CPC classification number: G06F9/3001 G06F9/30181 G06F15/80

    Abstract: Systems, methods, and apparatuses relating to a sign modification field for fused operations in a configurable spatial accelerator are described. In one embodiment, a hardware accelerator includes a plurality of processing elements; a network between the plurality of processing elements to transfer values between the plurality of processing elements; and a processing element of the plurality of processing elements comprising: a first plurality of input queues having a multiple bit width coupled to the network, at least one first output queue having the multiple bit width coupled to the network, operation circuitry coupled to the first plurality of input queues having the multiple bit width, a sign modification circuit coupled to the first plurality of input queues having the multiple bit width, and a configuration register within the processing element to store a configuration value comprising a sign modification field that causes the sign modification circuit to modify a sign bit of a value from the first plurality of input queues according to the sign modification field to create a sign modified value, and the configuration value causes the operation circuitry to perform a selected operation of a plurality of operations on a value from the first plurality of input queues and the sign modified value to create a resultant value, and store the resultant value in the at least one first output queue.
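A small software model can illustrate how a sign modification field might fuse two operations into one. The following is a hedged sketch only. The field encodings (`none`, `negate`, `abs`, `neg_abs`), the operation set, and the `ProcessingElement` class are all hypothetical, since the abstract does not name them. Queues are modeled with `collections.deque`.

```python
import math
import operator
from collections import deque

# Hypothetical encodings of the sign modification field.
SIGN_MODS = {
    "none": lambda x: x,
    "negate": lambda x: -x,                # flip the sign bit
    "abs": lambda x: math.fabs(x),         # clear the sign bit
    "neg_abs": lambda x: -math.fabs(x),    # set the sign bit
}

# Hypothetical set of selectable operations.
OPS = {"add": operator.add, "mul": operator.mul, "max": max}


class ProcessingElement:
    """Toy model: two input queues, one output queue, one configuration."""

    def __init__(self, op, sign_mod):
        # Stands in for the configuration register described in the abstract.
        self.config = {"op": op, "sign_mod": sign_mod}
        self.in_a = deque()
        self.in_b = deque()
        self.out = deque()

    def step(self):
        """Fire once if both inputs hold a value; return True if fired."""
        if not (self.in_a and self.in_b):
            return False
        a = self.in_a.popleft()
        # Sign-modify the second operand, then apply the selected operation.
        b = SIGN_MODS[self.config["sign_mod"]](self.in_b.popleft())
        self.out.append(OPS[self.config["op"]](a, b))
        return True
```

Under these assumptions, configuring `add` with `negate` yields a subtraction in a single step, without a separate negation element in the dataflow graph.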

    Apparatuses, methods, and systems for configurable operand size operations in an operation configurable spatial accelerator

    Publication number: US11029958B1

    Publication date: 2021-06-08

    Application number: US16729372

    Filing date: 2019-12-28

    Abstract: Systems, methods, and apparatuses relating to configurable operand size operation circuitry in an operation configurable spatial accelerator are described. In one embodiment, a hardware accelerator includes a plurality of processing elements, a network between the plurality of processing elements to transfer values between the plurality of processing elements, and a first processing element of the plurality of processing elements including a first plurality of input queues having a multiple bit width coupled to the network, at least one first output queue having the multiple bit width coupled to the network, configurable operand size operation circuitry coupled to the first plurality of input queues, and a configuration register within the first processing element to store a configuration value that causes the configurable operand size operation circuitry to switch to a first mode for a first multiple bit width from a plurality of selectable multiple bit widths of the configurable operand size operation circuitry, perform a selected operation on a plurality of first multiple bit width values from the first plurality of input queues in series to create a resultant value, and store the resultant value in the at least one first output queue.
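One possible reading of "configurable operand size" is that a wide queue entry carries several narrower values, and the configured width determines how the entry is split before the selected operation is applied to each value in turn. The following sketch models that reading in software. The function name `lanewise_op`, its parameters, and the wrap-around overflow behavior are assumptions and are not taken from the source.

```python
import operator


def lanewise_op(word_a, word_b, lane_bits, op, total_bits=64):
    """Apply op to each lane_bits-wide lane of two packed words, in series.

    Each lane result is truncated to lane_bits (wrap-around is an assumption).
    """
    if lane_bits <= 0 or total_bits % lane_bits != 0:
        raise ValueError("lane_bits must evenly divide total_bits")
    mask = (1 << lane_bits) - 1
    result = 0
    # Walk the lanes one after another, lowest lane first.
    for shift in range(0, total_bits, lane_bits):
        a = (word_a >> shift) & mask
        b = (word_b >> shift) & mask
        result |= (op(a, b) & mask) << shift
    return result


# The same 32-bit inputs give different results under different width modes.
as_16_bit = lanewise_op(0x00FF0001, 0x00010001, 16, operator.add, total_bits=32)
as_8_bit = lanewise_op(0x00FF0001, 0x00010001, 8, operator.add, total_bits=32)
```

In the 16-bit mode the carry out of `0xFF + 0x01` stays inside its lane. In the 8-bit mode the same addition wraps to zero, which shows why the selected width has to be part of the configuration value.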
