-
Publication (Announcement) No.: US12131258B2
Publication (Announcement) Date: 2024-10-29
Application No.: US17030315
Filing Date: 2020-09-23
Applicant: QUALCOMM Incorporated
Inventor: Yadong Lu, Ying Wang, Tijmen Pieter Frederik Blankevoort, Christos Louizos, Matthias Reisser, Jilei Hou
Abstract: A method for compressing a deep neural network includes determining a pruning ratio for a channel and a mixed-precision quantization bit-width based on an operational budget of a device implementing the deep neural network. The method further includes quantizing a weight parameter of the deep neural network and/or an activation parameter of the deep neural network based on the quantization bit-width. The method also includes pruning the channel of the deep neural network based on the pruning ratio.
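The two operations named in the abstract, quantizing to a chosen bit-width and pruning channels by a pruning ratio, can be illustrated with a minimal sketch. Everything here is an assumption for illustration rather than the patented method: the function names are hypothetical, the pruning criterion (smallest L1 norm) and the symmetric uniform quantizer are common stand-ins, and the pruning ratio and bit-width are passed in directly instead of being derived from a device's operational budget.

```python
def quantize_uniform(values, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits.

    Assumption: a simple per-channel max-abs scale; the patent does not
    specify this quantizer.
    """
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0 or levels == 0:
        return [0.0 for _ in values]
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def prune_channels(channels, pruning_ratio):
    """Zero out the `pruning_ratio` fraction of channels with the smallest L1 norm.

    Assumption: L1 norm as the importance score (hypothetical choice).
    """
    n_prune = int(len(channels) * pruning_ratio)
    order = sorted(range(len(channels)),
                   key=lambda i: sum(abs(w) for w in channels[i]))
    pruned = set(order[:n_prune])
    return [[0.0] * len(ch) if i in pruned else list(ch)
            for i, ch in enumerate(channels)]


def compress_layer(channels, pruning_ratio, bits):
    """Prune channels first, then quantize the surviving weights."""
    kept = prune_channels(channels, pruning_ratio)
    return [quantize_uniform(ch, bits) for ch in kept]


if __name__ == "__main__":
    layer = [[0.1, -0.2], [1.0, -0.5], [0.01, 0.02], [0.7, 0.7]]
    print(compress_layer(layer, pruning_ratio=0.5, bits=4))
```

In this sketch a lower bit-width and a higher pruning ratio both reduce the cost of the layer, which is the trade-off the abstract ties to the device's operational budget.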
-
Publication (Announcement) No.: US11562208B2
Publication (Announcement) Date: 2023-01-24
Application No.: US16413535
Filing Date: 2019-05-15
Applicant: QUALCOMM Incorporated
Abstract: A method for quantizing a neural network includes modeling noise of parameters of the neural network. The method also includes assigning grid values to each realization of the parameters according to a concrete distribution that depends on a local fixed-point quantization grid and the modeled noise. The method further includes computing a fixed-point value representing parameters of a hard fixed-point quantized neural network.
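The steps in this abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the noise is assumed to be logistic, the concrete distribution is sampled with the Gumbel-softmax trick, the grid is a small fixed list rather than a true local grid, and all function names are hypothetical.

```python
import math
import random


def logistic_cdf(x, mean, scale):
    """Numerically stable CDF of a logistic distribution (assumed noise model)."""
    z = (x - mean) / scale
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grid_probabilities(x, grid, noise_scale):
    """Probability that the noisy value of x falls in the bin around each grid point."""
    step = grid[1] - grid[0]  # assumes an evenly spaced grid
    probs = [logistic_cdf(g + step / 2, x, noise_scale)
             - logistic_cdf(g - step / 2, x, noise_scale) for g in grid]
    total = sum(probs) or 1.0
    return [p / total for p in probs]


def relaxed_quantize(x, grid, noise_scale, temperature, rng):
    """Soft assignment of x to grid values via a concrete (Gumbel-softmax) sample."""
    probs = grid_probabilities(x, grid, noise_scale)
    logits = []
    for p in probs:
        u = max(rng.random(), 1e-12)           # avoid log(0)
        gumbel = -math.log(-math.log(u))
        logits.append((math.log(p + 1e-20) + gumbel) / temperature)
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return sum((e / z) * g for e, g in zip(exps, grid))


def hard_quantize(x, grid):
    """Hard fixed-point value: the nearest grid point, with no noise."""
    return min(grid, key=lambda g: abs(g - x))


if __name__ == "__main__":
    grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
    rng = random.Random(0)
    print(relaxed_quantize(0.3, grid, noise_scale=0.1, temperature=0.5, rng=rng))
    print(hard_quantize(0.3, grid))
```

The relaxed version yields a weighted mix of grid values that varies smoothly with x, which is what makes gradient-based training possible; the hard version is the deterministic rounding a deployed fixed-point network would use.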
-