AUTOMATIC COMPRESSION OF MACHINE LEARNING MODELS

    Publication (announcement) number: US20240046097A1

    Publication (announcement) date: 2024-02-08

    Application number: US17817662

    Filing date: 2022-08-05

    CPC classification number: G06N3/082 G06K9/6228 G06K9/6262

    Abstract: A computer-implemented method for compressing a machine learning model includes converting an input machine learning model into a standard machine learning model. The method further includes converting the standard machine learning model into a plurality of pruned machine learning models, each of the pruned machine learning models converted using a corresponding pruning ratio from a pruning ratio candidate list. The method further includes determining, for each of the pruned machine learning models, a size-to-error ratio. The method further includes selecting, based on the size-to-error ratio of the pruned machine learning models, a first pruning ratio from the pruning ratio candidate list. The method further includes generating a compressed machine learning model by compressing the input machine learning model using the first pruning ratio that is selected. The method further includes deploying the compressed machine learning model for production.
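The candidate-sweep and selection steps described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not taken from the patent: the model is a plain list of linear weights, pruning is magnitude-based, size is the count of nonzero weights, error is mean squared error, and the helper names (`prune`, `model_size`, `model_error`, `size_to_error_ratios`, `select_pruning_ratio`) are hypothetical. The abstract does not state how the size-to-error ratio is turned into a choice, so the selection rule is left as a parameter. The conversion to a "standard" model and the deployment step are not modeled.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Weights = List[float]
Sample = Tuple[Sequence[float], float]  # (features, target)


def prune(weights: Weights, ratio: float) -> Weights:
    """Zero out the smallest-magnitude fraction `ratio` of the weights.

    Magnitude pruning is an assumption; the abstract names no pruning method.
    """
    k = int(len(weights) * ratio)  # number of weights to zero out
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])  # indices of the k smallest-magnitude weights
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def model_size(weights: Weights) -> int:
    """Toy size measure (assumption): number of nonzero weights."""
    return sum(1 for w in weights if w != 0.0)


def model_error(weights: Weights, data: Sequence[Sample]) -> float:
    """Toy error measure (assumption): mean squared error of a linear model."""
    total = 0.0
    for features, target in data:
        prediction = sum(w * x for w, x in zip(weights, features))
        total += (prediction - target) ** 2
    return total / len(data)


def size_to_error_ratios(
    weights: Weights,
    data: Sequence[Sample],
    candidates: Sequence[float],
    eps: float = 1e-12,
) -> Dict[float, float]:
    """Prune once per candidate ratio and record size / error for each."""
    ratios: Dict[float, float] = {}
    for ratio in candidates:
        pruned = prune(weights, ratio)
        # eps avoids division by zero when a pruned model fits the data exactly.
        ratios[ratio] = model_size(pruned) / (model_error(pruned, data) + eps)
    return ratios


def select_pruning_ratio(
    ratios: Dict[float, float],
    rule: Callable[..., float] = min,
) -> float:
    """Pick one candidate from its size-to-error ratio.

    `rule` (min or max here) is a placeholder: the abstract does not say
    which direction, or which other criterion, the selection uses.
    """
    return rule(ratios, key=ratios.get)


if __name__ == "__main__":
    weights = [0.9, -0.05, 0.4, 0.01]
    data = [((1.0, 2.0, 3.0, 4.0), 2.0), ((0.5, 1.0, 0.0, 2.0), 0.4)]
    candidates = [0.25, 0.5, 0.75]

    ratios = size_to_error_ratios(weights, data, candidates)
    chosen = select_pruning_ratio(ratios)
    compressed = prune(weights, chosen)  # final compression with the chosen ratio
    print(ratios, chosen, compressed)
```

Keeping the selection rule as a parameter makes the undocumented part of the method explicit: a real implementation would replace `rule` with whatever criterion the full specification defines.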
