SPECIALIZED, DATA-FREE MODEL QUANTIZATION

    Publication No.: US20250054276A1

    Publication date: 2025-02-13

    Application No.: US18231466

    Application date: 2023-08-08

    Abstract: In one implementation, a device obtains a base machine learning model trained to label input data using a plurality of classes. The device receives a deployment task from a user interface indicative of a subset of one or more of the plurality of classes to be identified by a new model for deployment. The device selects a quantization level based on a difficulty associated with the deployment task. The device generates the new model for deployment that is quantized from the base machine learning model and specialized to label its input data using only the subset of one or more of the plurality of classes.
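
To illustrate the flow in this abstract, here is a minimal Python sketch that picks a quantization level from a task-difficulty proxy and keeps only the classifier rows for the requested classes. The abstract does not say how difficulty is measured or how quantization is done; the class-ratio heuristic, the bit-width thresholds, the symmetric uniform quantizer, and the names `select_bits`, `quantize_row`, and `specialize_head` are all assumptions made for illustration. The sketch uses only the stored weights and needs no calibration data.

```python
from typing import List, Sequence, Tuple


def select_bits(num_selected: int, num_total: int) -> int:
    """Hypothetical heuristic: a smaller class subset is treated as an
    easier task, so it gets a lower bit-width. Thresholds are assumptions."""
    if not 0 < num_selected <= num_total:
        raise ValueError("subset must be non-empty and within the base classes")
    ratio = num_selected / num_total
    if ratio <= 0.1:
        return 4
    if ratio <= 0.5:
        return 8
    return 16


def quantize_row(row: Sequence[float], bits: int) -> List[float]:
    """Symmetric uniform quantization of one weight row.
    Values are snapped to the integer grid and returned dequantized."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in row)
    if max_abs == 0.0:
        return [0.0 for _ in row]
    scale = max_abs / qmax
    return [round(w / scale) * scale for w in row]


def specialize_head(
    head: Sequence[Sequence[float]], keep: Sequence[int]
) -> Tuple[int, List[List[float]]]:
    """Keep only the classifier rows for the requested classes and
    quantize them at the bit-width chosen for the task."""
    bits = select_bits(len(keep), len(head))
    return bits, [quantize_row(head[c], bits) for c in keep]


# Toy base classifier head: 10 classes, 3 weights per class.
base_head = [[0.1 * i, -0.2 * i, 0.05] for i in range(10)]
bits, new_head = specialize_head(base_head, keep=[2, 5])
```

In this toy run the deployment task asks for 2 of 10 classes, so the heuristic lands in the middle bit-width tier. A real system would likely measure difficulty differently, for example from how easily the selected classes are confused with one another.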

    DYNAMIC COMPRESSION AND SPECIALIZATION OF A MACHINE LEARNING MODEL

    Publication No.: US20250036933A1

    Publication date: 2025-01-30

    Application No.: US18225371

    Application date: 2023-07-24

    Abstract: In one embodiment, a device identifies a plurality of tasks that a base machine learning model is able to perform. The device receives, via a user interface, a request to generate a specialized model to perform a particular task for deployment to a target deployment environment. The device uses knowledge distillation on the base machine learning model to train the specialized model to perform the particular task based on at least one of the plurality of tasks. The device causes the specialized model to be deployed to the target deployment environment.
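
The knowledge distillation step named in this abstract can be sketched with a common textbook formulation: the specialized (student) model is trained to match the temperature-softened output distribution of the base (teacher) model. The abstract does not specify a loss, so the temperature-scaled KL divergence below, along with the names `softmax` and `distillation_loss` and the default temperature, are assumptions and not the patented method.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2
    as is conventional so gradient magnitudes stay comparable across T."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl


# Hypothetical example: the base model scores 5 classes, the particular
# task only needs classes 1 and 3, so the student is compared on those.
task_classes = [1, 3]
teacher_all = [0.2, 2.5, -1.0, 0.7, 0.1]
teacher_task = [teacher_all[c] for c in task_classes]
student_task = [1.9, 0.9]

loss = distillation_loss(teacher_task, student_task)
```

During training, the student's parameters would be updated to reduce this loss over many inputs; the loss is zero exactly when the two softened distributions agree.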
