Sparsity-aware neural processing unit for performing constant probability index matching and processing method of the same
摘要:
A method of processing of a sparsity-aware neural processing unit includes receiving a plurality of input activations (IA); obtaining a weight having a non-zero value in each weight output channel; storing the weight and the IA in a memory, and obtaining an input channel index comprising a memory address location in which the weight and the IA are stored; and arranging the non-zero weight of each weight output channel according to a row size of an index matching unit (IMU) and matching the IA to the weight in the IMU comprising a buffer memory storing the input channel index.
信息查询
0/0