Publication Number: US12154026B2
Publication Date: 2024-11-26
Application Number: US17284480
Filing Date: 2020-01-09
Applicant: SOUTHEAST UNIVERSITY
Inventors: Shengli Lu, Wei Pang, Ruili Wu, Yingbo Fan, Hao Liu, Cheng Huang
Abstract: A deep neural network hardware accelerator comprises an AXI-4 bus interface, an input cache area, an output cache area, a weight cache area, a weight index cache area, an encoding module, a configurable state controller module, and a PE array. The input and output cache areas are designed as line cache structures. An encoder encodes the weights according to an ordered quantization set, which stores all possible absolute values of the weights after quantization. During computation, each PE unit reads data from the input cache area and the weight index cache area, performs shift operations, and writes the results to the output cache area. By replacing floating-point multiplications with shift operations, the accelerator reduces its demands on computing resources, storage resources, and communication bandwidth, and increases computational efficiency.
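The core idea in the abstract, quantizing each weight to a signed power of two so that every multiplication becomes a bit shift indexed through the quantization set, can be sketched in software. This is a hypothetical illustration, not the patented hardware design: the exponent range `EXPONENTS`, the encoding functions, and the fixed-point activation format are all assumptions made for the example.

```python
# Hypothetical sketch of shift-based MAC with power-of-two quantized weights.
# The ordered quantization set QSET stores the possible absolute values of
# the weights after quantization (here 2^e for a small exponent range).

EXPONENTS = [-3, -2, -1, 0, 1, 2, 3]          # assumed exponent range
QSET = [2.0 ** e for e in EXPONENTS]          # ordered quantization set

def encode_weight(w):
    """Encode a real-valued weight as (sign, index into QSET).

    The index, not the weight itself, is what would be stored in the
    weight index cache area, shrinking storage and bandwidth needs.
    """
    sign = -1 if w < 0 else 1
    # pick the quantization-set entry closest to |w|
    idx = min(range(len(QSET)), key=lambda i: abs(QSET[i] - abs(w)))
    return sign, idx

def shift_mac(acc, x, sign, idx):
    """One PE step: acc += x * (sign * 2^EXPONENTS[idx]), shifts only.

    x and acc are integer fixed-point activations; no floating-point
    multiplier is needed, only a shifter and an adder.
    """
    e = EXPONENTS[idx]
    prod = x << e if e >= 0 else x >> -e      # arithmetic shift
    return acc + (prod if sign > 0 else -prod)

# Example: weight 0.26 quantizes to +2^-2 = 0.25, so multiplying the
# activation 16 by it is a right shift by 2, yielding 4.
sign, idx = encode_weight(0.26)
result = shift_mac(0, 16, sign, idx)
```

Multiplying by `-4` (sign `-1`, exponent `2`) similarly becomes a left shift by 2 followed by negation, which is why the design needs only shifters rather than floating-point multipliers in each PE.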