Using a low-bit-width dot product engine to sum high-bit-width numbers

    Publication No.: US11455143B2

    Publication date: 2022-09-27

    Application No.: US16869281

    Filing date: 2020-05-07

    IPC classification: G06F7/544 G06F7/483

    Abstract: A device (e.g., an integrated circuit chip) includes a dot product processing component, a data alignment component, and an accumulator. The dot product processing component is configured to calculate a dot product of a first group of elements stored in a first storage unit with a second group of elements, wherein: each element of the first group of elements is represented using a first number of bits, each value of a group of values stored in the first storage unit is represented using a second number of bits greater than the first number of bits, and each value of the group of values is stored as split segments across more than one element of the elements of the first group of elements. The data alignment component is configured to receive results of the dot product processing component and modify one or more of the results of the dot product processing component. The accumulator is configured to sum outputs of the data alignment component to at least in part determine a sum of the group of values.
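The arithmetic behind this abstract can be illustrated with a small software sketch. It is not the patented hardware design: the 8-bit element width, the 32-bit unsigned values, the 0/1 selection vectors, and the use of a left shift as the alignment step are all assumptions made for illustration, and the names `split_into_segments`, `dot`, and `sum_values` are hypothetical.

```python
from typing import List

SEG_BITS = 8          # assumed element width of the dot product engine
SEGS_PER_VALUE = 4    # assumed: each 32-bit value is split into four 8-bit segments


def split_into_segments(value: int) -> List[int]:
    """Split an unsigned 32-bit value into 8-bit segments, least significant first."""
    seg_mask = (1 << SEG_BITS) - 1
    return [(value >> (SEG_BITS * k)) & seg_mask for k in range(SEGS_PER_VALUE)]


def dot(a: List[int], b: List[int]) -> int:
    """Software stand-in for the low-bit-width dot product engine."""
    return sum(x * y for x, y in zip(a, b))


def sum_values(values: List[int]) -> int:
    """Sum wide values using only dot products over narrow elements."""
    # First group of elements: every value stored as consecutive 8-bit segments.
    elements = [seg for v in values for seg in split_into_segments(v)]
    total = 0  # accumulator (wider than one element)
    for k in range(SEGS_PER_VALUE):
        # Second group of elements: 1 where the element is segment k of a value, else 0.
        select = [1 if i % SEGS_PER_VALUE == k else 0 for i in range(len(elements))]
        partial = dot(elements, select)
        # Alignment step (assumed to be a shift): restore the weight of segment k.
        total += partial << (SEG_BITS * k)
    return total


values = [0x01020304, 0xFFFFFFFF, 70000]
print(sum_values(values) == sum(values))
```

Each dot product adds up one segment position across all values, so the engine never multiplies anything wider than 8 bits; only the shifted partial sums and the accumulator need the wider range.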

FLEXIBLE MATRIX PROCESSING
    Published invention application

    Publication No.: US20240095304A1

    Publication date: 2024-03-21

    Application No.: US18382891

    Filing date: 2023-10-23

    IPC classification: G06F17/16 G06F7/78

    CPC classification: G06F17/16 G06F7/78

    Abstract: A system includes a matrix transpose component, a matrix processing component, a data modification component, and a data reduction component. The matrix transpose component is configured to transpose a stored matrix to an output matrix. The matrix processing component is configured to multiply the output matrix with a mask vector to determine a result vector. The data modification component is configured to modify at least a portion of the result vector to determine a modified vector. The data reduction component is configured to sum at least a portion of elements included in the modified vector.
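The four stages named in this abstract (transpose, multiply by a mask vector, modify, reduce) can be traced in a short sketch. This is an illustrative reading, not the claimed implementation: the abstract does not say what the modification is, so the per-element left shift used here is an assumption, as are the optional `keep` index list and the function names `transpose`, `mat_vec`, and `flexible_sum`.

```python
from typing import List, Optional, Sequence

Matrix = List[List[int]]


def transpose(m: Matrix) -> Matrix:
    """Matrix transpose stage: rows become columns."""
    return [list(col) for col in zip(*m)]


def mat_vec(m: Matrix, v: Sequence[int]) -> List[int]:
    """Matrix processing stage: multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def flexible_sum(
    stored: Matrix,
    row_mask: Sequence[int],
    shifts: Optional[Sequence[int]] = None,
    keep: Optional[Sequence[int]] = None,
) -> int:
    output = transpose(stored)
    # One entry per column of `stored`: the sum over the rows selected by the mask.
    result = mat_vec(output, row_mask)
    # Data modification stage (assumed here to be a per-element left shift).
    if shifts is None:
        shifts = [0] * len(result)
    modified = [r << s for r, s in zip(result, shifts)]
    # Data reduction stage: sum all, or only a chosen portion, of the elements.
    if keep is None:
        keep = range(len(modified))
    return sum(modified[i] for i in keep)


stored = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(flexible_sum(stored, [1, 0, 1]))               # rows 0 and 2 only: 30
print(flexible_sum(stored, [1, 0, 1], keep=[0, 2]))  # columns 0 and 2 only: 20
```

Changing the mask vector or the kept indices selects a different subset of the stored matrix to sum without changing the stored data, which is one plausible reading of "flexible" in the title.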

    Device and method for flexibly summing matrix values

    Publication No.: US11829441B2

    Publication date: 2023-11-28

    Application No.: US17834203

    Filing date: 2022-06-07

    IPC classification: G06F17/16 G06F7/78

    CPC classification: G06F17/16 G06F7/78

    Abstract: A device includes a matrix transpose component, a matrix processing component, a data alignment component, and a data reduction component. The matrix transpose component is configured to transpose an input matrix of elements to output an output matrix of the elements that have been transposed. The matrix processing component is configured to multiply a first multiplication input matrix with a second multiplication input matrix, wherein the output matrix of the matrix transpose component is utilized as the first multiplication input matrix and a mask vector is utilized as the second multiplication input matrix. The data alignment component is configured to modify at least a portion of elements of a result of the matrix processing component. The data reduction component is configured to sum at least the elements of the modified result of the matrix processing component to determine a sum of the group of values.

    DEVICE AND METHOD FOR FLEXIBLY SUMMING MATRIX VALUES

    Publication No.: US20220374499A1

    Publication date: 2022-11-24

    Application No.: US17834203

    Filing date: 2022-06-07

    IPC classification: G06F17/16 G06F7/78

    Abstract: A device includes a matrix transpose component, a matrix processing component, a data alignment component, and a data reduction component. The matrix transpose component is configured to transpose an input matrix of elements to output an output matrix of the elements that have been transposed. The matrix processing component is configured to multiply a first multiplication input matrix with a second multiplication input matrix, wherein the output matrix of the matrix transpose component is utilized as the first multiplication input matrix and a mask vector is utilized as the second multiplication input matrix. The data alignment component is configured to modify at least a portion of elements of a result of the matrix processing component. The data reduction component is configured to sum at least the elements of the modified result of the matrix processing component to determine a sum of the group of values.