METHOD, ELECTRONIC DEVICE, AND STORAGE MEDIUM FOR DISTILLING MODEL

    Publication No.: US20210383233A1

    Publication Date: 2021-12-09

    Application No.: US17101748

    Filing Date: 2020-11-23

    Abstract: The disclosure provides a method for distilling a model, an electronic device, and a storage medium, relating to the field of deep learning technologies. A teacher model and a student model are obtained. Based on a first data processing capacity of a first intermediate fully connected layer of the teacher model and a second data processing capacity of a second intermediate fully connected layer of the student model, the second intermediate fully connected layer is transformed into an enlarged fully connected layer and a reduced fully connected layer. The second intermediate fully connected layer is then replaced with the enlarged fully connected layer and the reduced fully connected layer to generate a training student model, and the training student model is distilled based on the teacher model.
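
    As a rough illustration of the enlarge/reduce replacement described above, the following PyTorch sketch swaps a student's intermediate fully connected layer for an enlarged layer at the teacher's width followed by a reduced layer back at the student's width, so the enlarged activations can be matched to the teacher's intermediate features during distillation. The layer widths, module names, and the MSE matching loss are illustrative assumptions; the patent text does not include reference code.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        D_TEACHER = 1024  # first data processing capacity (teacher FC width); assumed value
        D_STUDENT = 256   # second data processing capacity (student FC width); assumed value

        class EnlargeReduce(nn.Module):
            """Stand-in for the student's intermediate FC layer: an enlarged FC
            layer projects up to the teacher's width and a reduced FC layer
            projects back down, so the enlarged activations can be matched to
            the teacher's intermediate features."""
            def __init__(self, d_student, d_teacher):
                super().__init__()
                self.enlarge = nn.Linear(d_student, d_teacher)  # enlarged fully connected layer
                self.reduce = nn.Linear(d_teacher, d_student)   # reduced fully connected layer

            def forward(self, x):
                hidden = self.enlarge(x)            # teacher-width intermediate features
                return self.reduce(hidden), hidden

        layer = EnlargeReduce(D_STUDENT, D_TEACHER)
        x = torch.randn(8, D_STUDENT)               # a batch of student activations
        student_out, student_hidden = layer(x)
        teacher_hidden = torch.randn(8, D_TEACHER)  # stands in for real teacher features
        loss = F.mse_loss(student_hidden, teacher_hidden)  # intermediate distillation loss
        loss.backward()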

    METHOD AND APPARATUS FOR IMPROVING MODEL BASED ON PRE-TRAINED SEMANTIC MODEL

    Publication No.: US20210397794A1

    Publication Date: 2021-12-23

    Application No.: US17249718

    Filing Date: 2021-03-10

    Abstract: Embodiments of a method and an apparatus for improving a model based on a pre-trained semantic model are provided. The method may include: obtaining an initial improved model based on the pre-trained semantic model, in which semantic result information for an input vector is determined by a hash search method; and training the initial improved model with a model distillation method to obtain an improved model. By obtaining the semantic result information through a hash search on the input vector, some embodiments replace the semantic model's original complex iterative calculation and yield an improved model with fewer parameters and a high compression ratio.
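
    As a rough illustration of the hash-search idea in this abstract, the sketch below replaces a semantic model's forward computation with a locality-sensitive hash of the input vector and a learned table lookup, trained to imitate a teacher's semantic output. The hashing scheme (sign random projection), table size, and MSE distillation loss are illustrative assumptions rather than the patented method's specifics.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        D_IN, D_OUT = 128, 768  # input-vector and semantic-vector sizes; assumed values
        N_BITS = 12             # hash width, giving 2**N_BITS buckets; assumed value

        class HashSemanticModel(nn.Module):
            """Maps an input vector to a bucket with sign-based locality-sensitive
            hashing and returns a learned semantic vector for that bucket: a single
            table lookup instead of the original model's iterative computation."""
            def __init__(self):
                super().__init__()
                # Fixed random hyperplanes used only for hashing (not trained).
                self.register_buffer("planes", torch.randn(D_IN, N_BITS))
                self.table = nn.Embedding(2 ** N_BITS, D_OUT)  # semantic result table

            def bucket(self, x):
                bits = (x @ self.planes > 0).long()             # (batch, N_BITS) of 0/1
                weights = 2 ** torch.arange(N_BITS, device=x.device)
                return (bits * weights).sum(dim=-1)             # bucket index per row

            def forward(self, x):
                return self.table(self.bucket(x))

        # One distillation step: match the lookup output to the pre-trained model's
        # semantic output (random tensors stand in for real teacher outputs here).
        student = HashSemanticModel()
        x = torch.randn(16, D_IN)
        teacher_semantics = torch.randn(16, D_OUT)
        loss = F.mse_loss(student(x), teacher_semantics)
        loss.backward()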
