Publication number: US20240046097A1
Publication date: 2024-02-08
Application number: US17817662
Filing date: 2022-08-05
Applicant: International Business Machines Corporation
Inventor: De Gao Chu, Lin Dong, Xiao Tian Xu, Xue Yin Zhuang
CPC classification number: G06N3/082, G06K9/6228, G06K9/6262
Abstract: A computer-implemented method for compressing a machine learning model includes converting an input machine learning model into a standard machine learning model. The method further includes converting the standard machine learning model into a plurality of pruned machine learning models, each of the pruned machine learning models converted using a corresponding pruning ratio from a pruning ratio candidate list. The method further includes determining, for each of the pruned machine learning models, a size-to-error ratio. The method further includes selecting, based on the size-to-error ratio of the pruned machine learning models, a first pruning ratio from the pruning ratio candidate list. The method further includes generating a compressed machine learning model by compressing the input machine learning model using the first pruning ratio that is selected. The method further includes deploying the compressed machine learning model for production.
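The selection loop described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the "model" is a plain list of linear weights, pruning is done by weight magnitude, error is mean squared error on a small validation set, and the size-to-error ratio is taken to be the number of weights removed divided by the added error, with the largest value winning. The abstract does not define the ratio or the selection rule, and the conversion of the input model into a standard model is omitted. All function names are hypothetical.

```python
from typing import List, Sequence

def prune_by_magnitude(weights: Sequence[float], ratio: float) -> List[float]:
    """Zero out the fraction `ratio` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * ratio)
    # Indices in ascending order of magnitude; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def model_size(weights: Sequence[float]) -> int:
    """Size proxy (assumption): the number of nonzero weights."""
    return sum(1 for w in weights if w != 0.0)

def predict(weights: Sequence[float], x: Sequence[float]) -> float:
    return sum(w * xi for w, xi in zip(weights, x))

def mse(weights, xs, ys) -> float:
    """Mean squared error of a linear model on the validation set."""
    return sum((predict(weights, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def size_to_error_ratio(base, pruned, xs, ys, eps: float = 1e-9) -> float:
    """Assumed definition: weights removed per unit of added error."""
    size_saved = model_size(base) - model_size(pruned)
    added_error = max(mse(pruned, xs, ys) - mse(base, xs, ys), 0.0)
    return size_saved / (added_error + eps)  # eps avoids division by zero

def select_pruning_ratio(base, candidates, xs, ys) -> float:
    """Score every candidate ratio and return the best one (assumed: max)."""
    scored = [
        (size_to_error_ratio(base, prune_by_magnitude(base, r), xs, ys), r)
        for r in candidates
    ]
    return max(scored)[1]

# Toy data: the fourth feature is always zero, so its weight is free to prune.
weights = [4.0, 0.1, -3.0, 0.2]
xs = [[1, 1, 1, 0], [1, 0, 1, 0], [0, 1, 0, 0], [1, 2, 0, 0]]
ys = [predict(weights, x) for x in xs]

best = select_pruning_ratio(weights, [0.25, 0.5, 0.75], xs, ys)
compressed = prune_by_magnitude(weights, best)  # final compression step
print(best, compressed)
```

On this toy data the candidate 0.5 scores highest: it removes two weights for the same added error that 0.25 incurs for one, while 0.75 removes a large weight and the error rises sharply. A different definition of the ratio, or a different selection rule, would change which candidate is chosen.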