DEVICE AND METHOD FOR FLEXIBLY SUMMING MATRIX VALUES

    Publication No.: US20220374499A1

    Publication date: 2022-11-24

    Application No.: US17834203

    Filing date: 2022-06-07

    IPC classification: G06F17/16 G06F7/78

    Abstract: A device includes a matrix transpose component, a matrix processing component, a data alignment component, and a data reduction component. The matrix transpose component is configured to transpose an input matrix of elements to output an output matrix of the elements that have been transposed. The matrix processing component is configured to multiply a first multiplication input matrix with a second multiplication input matrix, wherein the output matrix of the matrix transpose component is utilized as the first multiplication input matrix and a mask vector is utilized as the second multiplication input matrix. The data alignment component is configured to modify at least a portion of elements of a result of the matrix processing component. The data reduction component is configured to sum at least the elements of the modified result of the matrix processing component to determine a sum of the group of values.
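The four-stage pipeline in this abstract (transpose, multiply by a mask vector, align, reduce) can be sketched in software. This is only an illustrative model, not the claimed hardware: the function names, the 0/1 row mask, and the reading of "data alignment" as zeroing unwanted elements are all assumptions.

```python
def transpose(matrix):
    """Matrix transpose component: rows become columns."""
    return [list(col) for col in zip(*matrix)]


def mat_vec(matrix, vector):
    """Matrix processing component: multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def align(vector, keep):
    """Data alignment component (assumed behavior): zero out unwanted elements."""
    return [x if k else 0 for x, k in zip(vector, keep)]


def flexible_sum(matrix, row_mask, col_keep):
    """Sum the elements selected by row_mask (rows) and col_keep (columns)."""
    transposed = transpose(matrix)            # shape: cols x rows
    partial = mat_vec(transposed, row_mask)   # per-column sums over masked rows
    aligned = align(partial, col_keep)        # drop columns outside the group
    return sum(aligned)                       # data reduction component


matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

# Rows 0 and 2, columns 0 and 1: 1 + 2 + 7 + 8
total = flexible_sum(matrix, row_mask=[1, 0, 1], col_keep=[1, 1, 0])
print(total)  # 18
```

Changing the mask vector and the kept columns selects a different group of values without changing the data path, which is one plausible reading of "flexibly" in the title.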

    Bypassing zero-value multiplications in a hardware multiplier

    Publication No.: US11614920B2

    Publication date: 2023-03-28

    Application No.: US16869288

    Filing date: 2020-05-07

    Abstract: A device (e.g., integrated circuit chip) includes a first operand register, a second operand register, a multiplication unit, and a hardware logic component. The first operand register is configured to store a first operand value. The second operand register is configured to store a second operand value. The multiplication unit is configured to at least multiply the first operand value with the second operand value. The hardware logic component is configured to detect whether a zero value is provided and in response to a detection that the zero value is being provided: cause an update of at least the first operand register to be disabled, and cause a result of a multiplication of the first operand value with the second operand value to be a zero-value result.
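The zero-bypass behavior can be modeled with a small software sketch. The class name, the `register_writes` counter, and the choice to freeze both operand registers (the abstract only requires "at least the first") are assumptions made for illustration.

```python
class ZeroBypassMultiplier:
    """Toy model of a multiplier that skips work when an operand is zero."""

    def __init__(self):
        self.reg_a = 0             # first operand register
        self.reg_b = 0             # second operand register
        self.register_writes = 0   # hypothetical counter of register updates

    def step(self, a, b):
        # Hardware logic component: detect a zero-valued input.
        if a == 0 or b == 0:
            # Register update is disabled, so the old contents are kept
            # and the result is forced to zero without multiplying.
            return 0
        self.reg_a, self.reg_b = a, b
        self.register_writes += 1
        return self.reg_a * self.reg_b


mul = ZeroBypassMultiplier()
print(mul.step(3, 4))   # 12
print(mul.step(0, 9))   # 0, registers still hold 3 and 4
print(mul.reg_a, mul.reg_b, mul.register_writes)  # 3 4 1
```

In hardware, leaving the registers unchanged avoids toggling the multiplier inputs, which is the usual motivation for this kind of bypass; the abstract itself does not state the motivation.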

    Device and method for flexibly summing matrix values

    Publication No.: US11379557B2

    Publication date: 2022-07-05

    Application No.: US16869303

    Filing date: 2020-05-07

    IPC classification: G06F17/16 G06F7/78

    Abstract: A device includes a matrix transpose component, a matrix processing component, a data alignment component, and a data reduction component. The matrix transpose component is configured to transpose an input matrix of elements to output an output matrix of the elements that have been transposed. The matrix processing component is configured to multiply a first multiplication input matrix with a second multiplication input matrix, wherein the output matrix of the matrix transpose component is utilized as the first multiplication input matrix and a mask vector is utilized as the second multiplication input matrix. The data alignment component is configured to modify at least a portion of elements of a result of the matrix processing component. The data reduction component is configured to sum at least the elements of the modified result of the matrix processing component to determine a sum of the group of values.

    USING A LOW-BIT-WIDTH DOT PRODUCT ENGINE TO SUM HIGH-BIT-WIDTH NUMBERS

    Publication No.: US20230056304A1

    Publication date: 2023-02-23

    Application No.: US17894431

    Filing date: 2022-08-24

    IPC classification: G06F7/544 G06F7/483

    Abstract: A system includes a vector multiplier configured to multiply a first vector of integer elements with a second vector of integer elements to determine a resulting vector of integer elements, wherein integer elements of the first and second vectors of integer elements are represented using a first number of bits and an integer element of the first vector of integer elements represents a portion of a value of a group of values. The system further includes a vector adder configured to add together the integer elements of the resulting vector of integer elements to determine a summed result, a bit shifter configured to shift bits of the summed result leftward, and an accumulator configured to determine an accumulated output sum that includes the leftward-shifted summed result.
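The multiply, add, shift, and accumulate stages can be illustrated by summing 32-bit values with an engine that only sees 8-bit elements. This is a sketch under stated assumptions: the chunk width, the use of an all-ones second vector, and the restriction to non-negative integers are choices made here, not details taken from the abstract.

```python
def sum_with_narrow_engine(values, chunk_bits=8, value_bits=32):
    """Sum non-negative integers using only chunk_bits-wide vector elements."""
    chunk_mask = (1 << chunk_bits) - 1
    ones = [1] * len(values)   # second vector (assumed to be all ones)
    accumulator = 0

    for shift in range(0, value_bits, chunk_bits):
        # First vector: one chunk_bits-wide portion of each value.
        chunks = [(v >> shift) & chunk_mask for v in values]
        # Vector multiplier: element-wise product of the two vectors.
        products = [c * o for c, o in zip(chunks, ones)]
        # Vector adder: sum the elements of the resulting vector.
        partial = sum(products)
        # Bit shifter + accumulator: restore the chunk's weight and add.
        accumulator += partial << shift

    return accumulator


values = [70000, 123456789, 255, 4000000000]
print(sum_with_narrow_engine(values) == sum(values))  # True
```

The decomposition works because each value equals the sum of its chunks weighted by powers of two, so the chunk sums can be computed independently and recombined by shifting.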

    Support for different matrix multiplications by selecting adder tree intermediate results

    Publication No.: US11520854B2

    Publication date: 2022-12-06

    Application No.: US16667700

    Filing date: 2019-10-29

    IPC classification: G06F17/16

    Abstract: A first group of elements is element-wise multiplied with a second group of elements using a plurality of multipliers belonging to a matrix multiplication hardware unit. Results of the plurality of multipliers are added together using a hierarchical tree of adders belonging to the matrix multiplication hardware unit and a final result of the hierarchical tree of adders or any of a plurality of intermediate results of the hierarchical tree of adders is selectively provided for use in determining an output result matrix. A control unit is used to instruct the matrix multiplication hardware unit to perform a plurality of different matrix multiplications in parallel by using a combined matrix that includes elements of a plurality of different operand matrices and utilize one or more selected ones of the intermediate results of the hierarchical tree of adders for use in determining the output result matrix that includes different groups of elements representing different multiplication results corresponding to different ones of the different operand matrices.
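The idea of tapping intermediate adder-tree results can be sketched as follows. Two independent 4-element dot products are packed into one 8-wide multiplier array, and the tree level with two outputs yields both results at once. The function name, the 8-wide size, and the packing layout are assumptions for illustration; the sketch also assumes a power-of-two number of products.

```python
def adder_tree_levels(products):
    """Return every level of a pairwise adder tree, from inputs to final sum.

    Assumes len(products) is a power of two.
    """
    levels = [list(products)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


# Two unrelated dot products packed side by side into one combined operand.
a1, b1 = [1, 2, 3, 4], [5, 6, 7, 8]   # dot product = 70
a2, b2 = [1, 0, 2, 0], [3, 3, 3, 3]   # dot product = 9
combined_a = a1 + a2
combined_b = b1 + b2

# Multipliers: element-wise products of the combined operands.
products = [x * y for x, y in zip(combined_a, combined_b)]
levels = adder_tree_levels(products)

print(levels[2])   # [70, 9]  intermediate results: two separate dot products
print(levels[-1])  # [79]     final result: one 8-element dot product
```

Selecting `levels[2]` instead of `levels[-1]` plays the role of the control unit choosing intermediate results so that smaller multiplications run in parallel on the same hardware.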

    Hardware for floating-point arithmetic in multiple formats

    Publication No.: US11275560B2

    Publication date: 2022-03-15

    Application No.: US16795097

    Filing date: 2020-02-19

    IPC classification: G06F7/487 G06F7/485 H03M7/24

    Abstract: A floating-point number in a first format representation is received. Based on an identification of a floating-point format type of the floating-point number, different components of the first format representation are identified. The different components of the first format representation are placed in corresponding components of a second format representation of the floating-point number, wherein a total number of bits of the second format representation is larger than a total number of bits of the first format representation. At least one of the components of the second format representation is padded with one or more zero bits. The floating-point number in the second format representation is stored in a register. A multiplication using the second format representation of the floating-point number is performed.
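The widening step can be sketched for two common 16-bit formats, IEEE half precision (fp16) and bfloat16. The abstract does not name the formats or the layout of the second representation, so the 19-bit layout below (1 sign bit, 8 exponent bits, 10 mantissa bits), the per-format bias handling in `decode`, and all function names are assumptions. Only normal numbers are decoded.

```python
# Field widths and exponent biases of the two assumed input formats.
FORMATS = {
    "fp16": {"exp_bits": 5, "man_bits": 10, "bias": 15},
    "bf16": {"exp_bits": 8, "man_bits": 7, "bias": 127},
}
WIDE_EXP_BITS = 8    # hypothetical wide format: 1 + 8 + 10 = 19 bits
WIDE_MAN_BITS = 10


def widen(bits16, fmt):
    """Place sign, exponent, and mantissa into the wider layout."""
    f = FORMATS[fmt]
    sign = (bits16 >> 15) & 1
    exp = (bits16 >> f["man_bits"]) & ((1 << f["exp_bits"]) - 1)
    man = bits16 & ((1 << f["man_bits"]) - 1)
    # fp16: the 5-bit exponent is zero-padded in its high bits.
    # bf16: the 7-bit mantissa is zero-padded in its low bits.
    man_wide = man << (WIDE_MAN_BITS - f["man_bits"])
    return (sign << (WIDE_EXP_BITS + WIDE_MAN_BITS)) | (exp << WIDE_MAN_BITS) | man_wide


def decode(wide, fmt):
    """Decode a normal number from the wide layout (bias depends on origin)."""
    sign = -1.0 if (wide >> (WIDE_EXP_BITS + WIDE_MAN_BITS)) & 1 else 1.0
    exp = (wide >> WIDE_MAN_BITS) & ((1 << WIDE_EXP_BITS) - 1)
    man = wide & ((1 << WIDE_MAN_BITS) - 1)
    return sign * (1 + man / (1 << WIDE_MAN_BITS)) * 2.0 ** (exp - FORMATS[fmt]["bias"])


x = widen(0x3E00, "fp16")   # 1.5 in fp16
y = widen(0xC000, "bf16")   # -2.0 in bfloat16
print(decode(x, "fp16") * decode(y, "bf16"))  # -3.0
```

Once both inputs share one layout, a single multiplier data path can serve either format, which appears to be the point of the padding described in the abstract.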