Invention Grant
- Patent Title: Joint pruning and quantization scheme for deep neural networks
- Application No.: US17030315
- Application Date: 2020-09-23
- Publication No.: US12131258B2
- Publication Date: 2024-10-29
- Inventor: Yadong Lu, Ying Wang, Tijmen Pieter Frederik Blankevoort, Christos Louizos, Matthias Reisser, Jilei Hou
- Applicant: QUALCOMM Incorporated
- Applicant Address: US CA San Diego
- Assignee: QUALCOMM Incorporated
- Current Assignee: QUALCOMM Incorporated
- Current Assignee Address: US CA San Diego
- Agency: Seyfarth Shaw LLP
- Priority: GR 190100410 2019.09.24
- Main IPC: G06N3/082
- IPC: G06N3/082; G06N3/04; G06N3/063

Abstract:
A method for compressing a deep neural network includes determining a pruning ratio for a channel and a mixed-precision quantization bit-width based on an operational budget of a device implementing the deep neural network. The method further includes quantizing a weight parameter of the deep neural network and/or an activation parameter of the deep neural network based on the quantization bit-width. The method also includes pruning the channel of the deep neural network based on the pruning ratio.
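The three steps named in the abstract (choose a pruning ratio and bit-width under a budget, quantize, prune channels) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function names, the grid of candidate ratios and bit-widths, the use of total weight bits as a stand-in for the "operational budget", the rule of picking the costliest configuration that still fits, symmetric uniform quantization, and L1-norm channel ranking.

```python
from itertools import product


def choose_config(n_channels, weights_per_channel, budget_bits,
                  ratios=(0.0, 0.25, 0.5, 0.75), bit_widths=(2, 4, 8)):
    """Pick a (pruning ratio, bit-width) pair that fits the budget.

    Assumption: cost is the total number of weight bits that remain, and
    the largest cost that fits is used as a crude proxy for accuracy.
    """
    best = None
    for ratio, bits in product(ratios, bit_widths):
        kept = n_channels - int(n_channels * ratio)
        cost = kept * weights_per_channel * bits
        if cost <= budget_bits and (best is None or cost > best[0]):
            best = (cost, ratio, bits)
    if best is None:
        raise ValueError("no candidate configuration fits the budget")
    return best[1], best[2]


def quantize_uniform(values, bits):
    """Symmetric uniform quantization of a list of floats (bits >= 2)."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    levels = 2 ** (bits - 1) - 1                  # positive integer levels
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def prune_channels(channels, ratio):
    """Drop the fraction `ratio` of channels with the smallest L1 norm."""
    n_prune = int(len(channels) * ratio)
    by_norm = sorted(range(len(channels)),
                     key=lambda i: sum(abs(w) for w in channels[i]))
    keep = sorted(by_norm[n_prune:])              # preserve original order
    return [channels[i] for i in keep]


def compress(channels, budget_bits):
    """Quantize every channel, then prune, under one shared budget."""
    ratio, bits = choose_config(len(channels), len(channels[0]), budget_bits)
    quantized = [quantize_uniform(ch, bits) for ch in channels]
    return prune_channels(quantized, ratio), ratio, bits


if __name__ == "__main__":
    # Hypothetical layer: 4 output channels with 2 weights each.
    layer = [[0.7, -0.35], [0.01, 0.02], [1.0, 0.5], [-0.3, 0.9]]
    compressed, ratio, bits = compress(layer, budget_bits=24)
    print(ratio, bits, len(compressed))
```

In this toy setup the two decisions are coupled through a single budget: a higher bit-width can be afforded only by pruning more channels, and vice versa. A real system would replace the cost proxy with a measured or modeled hardware cost and would evaluate accuracy when ranking candidates.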
Public/Granted literature
- US20210089922A1 JOINT PRUNING AND QUANTIZATION SCHEME FOR DEEP NEURAL NETWORKS (Public/Granted day: 2021-03-25)