Deep neural network hardware accelerator based on power exponential quantization

    公开(公告)号:US12154026B2

    公开(公告)日:2024-11-26

    申请号:US17284480

    申请日:2020-01-09

    Abstract: A deep neural network hardware accelerator comprises: an AXI-4 bus interface, an input cache area, an output cache area, a weighting cache area, a weighting index cache area, an encoding module, a configurable state controller module, and a PE array. The input cache area and the output cache area are designed as a line cache structure; an encoder encodes weightings according to an ordered quantization set, the quantization set storing the possible value of the absolute value of all of the weightings after quantization. During the calculation of the accelerator, the PE unit reads data from the input cache area and the weighting index cache area to perform shift calculation, and sends the calculation result to the output cache area. The accelerator uses shift operations to replace floating point multiplication operations, reducing the requirements for computing resources, storage resources, and communication bandwidth, and increasing the calculation efficiency of the accelerator.

Patent Agency Ranking