Systems, apparatuses, and methods for chained fused multiply add

    Publication No.: US11487541B2

    Publication date: 2022-11-01

    Application No.: US17107134

    Filing date: 2020-11-30

    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add are described. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.
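
The iteration described in the abstract can be sketched in software. The following is a minimal illustration only, not the claimed hardware: the function name `chained_fma`, the use of Python floats, and the choice to pass the initial value as a separate vector are all assumptions made for this example. Element sizes and the single rounding step of a true fused operation are not modeled.

```python
from typing import List, Sequence


def chained_fma(initial: Sequence[float],
                sources: Sequence[Sequence[float]],
                scalar_subelements: Sequence[float]) -> List[float]:
    """Chain one multiply-accumulate per source vector (hypothetical helper).

    Iteration i multiplies every element of sources[i] by
    scalar_subelements[i] and adds the products to the running result.
    The first iteration starts from `initial`; later iterations start
    from the result of the previous iteration.
    """
    if len(sources) != len(scalar_subelements):
        raise ValueError("need one scalar sub-element per source vector")

    acc = list(initial)
    for src, sub in zip(sources, scalar_subelements):
        if len(src) != len(acc):
            raise ValueError("all vectors must have the same element count")
        # Packed multiply-accumulate: the same sub-element scales every lane.
        acc = [a + x * sub for a, x in zip(acc, src)]
    return acc


result = chained_fma(
    initial=[1.0, 2.0],
    sources=[[1.0, 1.0], [2.0, 3.0]],
    scalar_subelements=[10.0, 100.0],
)
print(result)  # [211.0, 312.0]
```

In this example the first iteration produces [11.0, 12.0], and the second iteration adds 100 times the second source vector to that intermediate result.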

    Systems, apparatuses, and methods for chained fused multiply add

    Publication No.: US10146535B2

    Publication date: 2018-12-04

    Application No.: US15299420

    Filing date: 2016-10-20

    Abstract: Embodiments of systems, apparatuses, and methods for chained fused multiply add are described. In some embodiments, an apparatus includes a decoder to decode a single instruction having an opcode, a destination field representing a destination operand, a first source field representing a plurality of packed data source operands of a first type that have packed data elements of a first size, a second source field representing a plurality of packed data source operands that have packed data elements of a second size, and a field for a memory location that stores a scalar value. A register file having a plurality of packed data registers includes registers for the plurality of packed data source operands that have packed data elements of a first size, the source operands that have packed data elements of a second size, and the destination operand. Execution circuitry executes the decoded single instruction to perform iterations of packed fused multiply accumulate operations by multiplying packed data elements of the sources of the first type by sub-elements of the scalar value, and adding results of these multiplications to an initial value in a first iteration and a result from a previous iteration in subsequent iterations.

    INSTRUCTION AND LOGIC FOR MULTIPLIER SELECTORS FOR MERGING MATH FUNCTIONS

    Type: Patent application

    Legal status: In force

    Publication No.: US20160092215A1

    Publication date: 2016-03-31

    Application No.: US14498126

    Filing date: 2014-09-26

    Abstract: A processor includes a front end with logic to identify a multiplier, multiplicand, and mathematical mode based upon an instruction. The processor also includes a multiplier circuit to apply Booth encoding to multiply the multiplier and multiplicand. The multiplier circuit includes circuitry to determine leftmost and rightmost partial products of multiplying the multiplier and multiplicand using Booth encoding. The circuitry includes a most significant bit (MSB) array and least significant bit (LSB) array corresponding to the multiplier. The multiplier circuit also includes logic to selectively enable selectors of the circuitry to find partial products based upon the mathematical mode of the instruction.
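
As a software illustration of the Booth encoding mentioned in the abstract above, the sketch below recodes a signed multiplier into radix-4 Booth digits and sums the resulting partial products. The radix-4 variant, the 8-bit default width, and the function names are assumptions for this example; the abstract does not state which Booth variant is used, and the mode-dependent enabling of selectors and the MSB/LSB arrays are not modeled.

```python
from typing import List


def booth_radix4_digits(multiplier: int, bits: int = 8) -> List[int]:
    """Recode a signed multiplier into radix-4 Booth digits in {-2, ..., 2}.

    Digit k is y[2k-1] + y[2k] - 2*y[2k+1], with y[-1] taken as 0, so the
    multiplier equals the sum of digit_k * 4**k.
    """
    if bits <= 0 or bits % 2:
        raise ValueError("bits must be a positive even number")
    if not -(1 << (bits - 1)) <= multiplier < (1 << (bits - 1)):
        raise ValueError("multiplier does not fit in the given width")

    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        low = (pattern >> i) & 1
        high = (pattern >> (i + 1)) & 1
        digits.append(prev + low - 2 * high)
        prev = high
    return digits


def booth_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Multiply by summing one shifted partial product per Booth digit."""
    partial_products = [
        (digit * multiplicand) << (2 * k)
        for k, digit in enumerate(booth_radix4_digits(multiplier, bits))
    ]
    return sum(partial_products)


print(booth_radix4_digits(7))   # [-1, 2, 0, 0]
print(booth_multiply(13, -3))   # -39
```

Radix-4 recoding halves the number of partial products relative to one product per multiplier bit, which is the usual motivation for Booth encoding in hardware multipliers.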

    Instruction and logic for multiplier selectors for merging math functions

    Type: Granted patent

    Legal status: In force

    Publication No.: US09588765B2

    Publication date: 2017-03-07

    Application No.: US14498126

    Filing date: 2014-09-26

    Abstract: A processor includes a front end with logic to identify a multiplier, multiplicand, and mathematical mode based upon an instruction. The processor also includes a multiplier circuit to apply Booth encoding to multiply the multiplier and multiplicand. The multiplier circuit includes circuitry to determine leftmost and rightmost partial products of multiplying the multiplier and multiplicand using Booth encoding. The circuitry includes a most significant bit (MSB) array and least significant bit (LSB) array corresponding to the multiplier. The multiplier circuit also includes logic to selectively enable selectors of the circuitry to find partial products based upon the mathematical mode of the instruction.

    FUNCTIONAL UNIT CAPABLE OF EXECUTING APPROXIMATIONS OF FUNCTIONS

    Type: Patent application

    Legal status: In force

    Publication No.: US20140201504A1

    Publication date: 2014-07-17

    Application No.: US14216884

    Filing date: 2014-03-17

    Abstract: A semiconductor chip is described having a functional unit that can execute a first instruction and execute a second instruction. The first instruction is an instruction that multiplies two operands. The second instruction is an instruction that approximates a function according to $C_0 + C_1 X_2 + C_2 X_2^2$. The functional unit has a multiplier circuit. The multiplier circuit has: i) a first input to receive bits of a first operand of the first instruction and receive bits of a $C_1$ term of the second instruction; ii) a second input to receive bits of a second operand of the first instruction and receive bits of an $X_2$ term of the second instruction.
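
To make the quadratic form in the abstract above concrete, the sketch below evaluates a three-coefficient polynomial in X2 and uses it to approximate 1/x on [1, 2). Everything beyond the polynomial itself is an assumption for illustration: the abstract does not say where the coefficients come from, so this example derives them per segment from a second-order Taylor expansion, and the names `SEGMENTS`, `reciprocal_coefficients`, and `approx_reciprocal` are hypothetical.

```python
from typing import Tuple

SEGMENTS = 16  # assumed number of equal-width segments covering [1, 2)


def eval_quadratic(c0: float, c1: float, c2: float, x2: float) -> float:
    """Evaluate c0 + c1*x2 + c2*x2**2."""
    return c0 + c1 * x2 + c2 * x2 * x2


def reciprocal_coefficients(index: int) -> Tuple[float, float, float]:
    """Taylor coefficients of 1/x around the start of segment `index`."""
    x1 = 1.0 + index / SEGMENTS
    return 1.0 / x1, -1.0 / x1 ** 2, 1.0 / x1 ** 3


def approx_reciprocal(x: float) -> float:
    """Approximate 1/x for 1 <= x < 2 with one quadratic per segment."""
    if not 1.0 <= x < 2.0:
        raise ValueError("x must lie in [1, 2)")
    index = int((x - 1.0) * SEGMENTS)   # which segment x falls in
    x1 = 1.0 + index / SEGMENTS         # segment start
    x2 = x - x1                         # offset within the segment
    c0, c1, c2 = reciprocal_coefficients(index)
    return eval_quadratic(c0, c1, c2, x2)


print(eval_quadratic(1.0, 2.0, 3.0, 2.0))  # 17.0
print(abs(approx_reciprocal(1.03) - 1 / 1.03) < 3e-4)  # True
```

With 16 segments the offset stays below 1/16, so the neglected cubic term bounds the error at roughly (1/16)^3, about 2.4e-4, in this sketch.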
