-
Publication number: US20180107630A1
Publication date: 2018-04-19
Application number: US15590798
Application date: 2017-05-09
Inventor: Ni Zhou, Wei Qi, Yong Wang, Jian Ouyang
CPC classification number: G06F17/16, G06F9/3895, G06N99/005
Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n-1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
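
The dataflow in this abstract can be sketched in software. The following is a minimal illustration, not the patented hardware design: the function name `blocked_matmul` is hypothetical, the Wallace tree multiplier is modeled as a plain sum of n products, and accumulating partial results across successive n-element segments is an assumption the abstract does not spell out.

```python
def blocked_matmul(a, b, n, k):
    """Multiply an M x N matrix a by an N x K matrix b, tile by tile.

    Each step broadcasts one n-element segment of a row of a to up to k
    simulated processing units; unit j also receives column j of the
    current n x k submatrix of b and computes one dot product.
    """
    m_rows, n_inner, k_cols = len(a), len(b), len(b[0])
    c = [[0] * k_cols for _ in range(m_rows)]
    for col0 in range(0, k_cols, k):          # next group of k output columns
        for in0 in range(0, n_inner, n):      # next n-wide slice of the inner dimension
            in1 = min(in0 + n, n_inner)
            for i in range(m_rows):
                row_seg = a[i][in0:in1]       # same segment goes to every unit
                for pe in range(min(k, k_cols - col0)):
                    col_seg = [b[r][col0 + pe] for r in range(in0, in1)]
                    # Stand-in for the Wallace tree: n multiplies, then a sum.
                    partial = sum(x * y for x, y in zip(row_seg, col_seg))
                    # Assumed: partial results are accumulated across slices.
                    c[i][col0 + pe] += partial
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = blocked_matmul(a, b, n=2, k=2)
```

In hardware the innermost loop over `pe` would run in parallel across the k processing units; here it is sequential only because this is a plain software model.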
-
Publication number: US10140251B2
Publication date: 2018-11-27
Application number: US15590798
Application date: 2017-05-09
Inventor: Ni Zhou, Wei Qi, Yong Wang, Jian Ouyang
Abstract: A processor and a method for executing a matrix multiplication operation on a processor. A specific implementation of the processor includes a data bus and an array processor having k processing units. The data bus is configured to sequentially read n columns of row vectors from an M×N multiplicand matrix and input same to each processing unit in the array processor, read an n×k submatrix from an N×K multiplier matrix and input each column vector of the submatrix to a corresponding processing unit in the array processor, and output a result obtained by each processing unit after executing a multiplication operation. Each processing unit in the array processor is configured to execute in parallel a vector multiplication operation on the input row and column vectors. Each processing unit includes a Wallace tree multiplier having n multipliers and n−1 adders. This implementation improves the processing efficiency of a matrix multiplication operation.
-
Publication number: US09912349B1
Publication date: 2018-03-06
Application number: US15628455
Application date: 2017-06-20
Inventor: Jian Ouyang, Ni Zhou, Yong Wang, Wei Qi
CPC classification number: H03M7/30, G06F9/30018, G06F9/30145, G06F9/30149, G06F9/30174, G06F9/3851, G06F17/16, G06N3/02, G06T15/005, H03M7/24
Abstract: The present disclosure provides a method and apparatus for processing a floating point number matrix, an apparatus and a computer readable storage medium. In embodiments of the present disclosure, the minimum value of the floating point number model matrix and the maximum value of the floating point number model matrix are obtained according to a floating point number model matrix to be compressed, and then, compression processing is performed for the floating point number model matrix to obtain the fixed point number model matrix according to the bit width, the minimum value of the floating point number model matrix and the maximum value of the floating point number model matrix. The compression processing is performed for the floating point number model matrix of the deep learning model by a fixed point method, to obtain the fixed point number model matrix and reduce the storage space and amount of operation of the deep learning model. Meanwhile, the present disclosure proposes a framework for implementing the apparatus in the deep learning network to maximize the deep learning network precision, that is, a multiplication portion of the matrix uses the apparatus, and operations of other portions such as activation function retain the floating point operation.
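
The abstract states that compression depends on the bit width and on the minimum and maximum of the floating point matrix, but it does not give the mapping. The sketch below assumes a uniform (affine) min-max quantization, a common way to realize such a scheme; the function names `quantize` and `dequantize` and the rounding rule are illustrative assumptions, not the patent's method.

```python
def quantize(matrix, bit_width):
    """Map floats to integers in [0, 2**bit_width - 1] using the matrix min and max.

    Assumed scheme: uniform steps between the minimum and the maximum value.
    Returns the integer matrix plus the offset and step needed to decode it.
    """
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    levels = (1 << bit_width) - 1
    # Guard against a constant matrix, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [[round((v - lo) / scale) for v in row] for row in matrix]
    return q, lo, scale


def dequantize(q, lo, scale):
    """Recover approximate floats from the fixed point representation."""
    return [[lo + x * scale for x in row] for row in q]


weights = [[-1.0, 0.0], [0.5, 1.0]]
q, lo, scale = quantize(weights, bit_width=8)
restored = dequantize(q, lo, scale)
```

Under this assumed scheme the reconstruction error per element is at most half a quantization step, which is why the abstract's framework can keep precision-sensitive operations such as the activation function in floating point while only the matrix multiplication uses the fixed point values.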
-
-