High performance machine learning inference framework for edge devices

    Publication No.: US11301762B1

    Publication Date: 2022-04-12

    Application No.: US16179217

    Filing Date: 2018-11-02

    Abstract: Techniques for high-performance machine learning (ML) inference in heterogeneous edge devices are described. An ML model trained using any of a variety of different frameworks is translated into a common format that is runnable by the inference engines of edge devices. The translated model is optimized in hardware-agnostic and/or hardware-specific ways to improve inference performance, and the optimized model is sent to the edge devices. The inference engine for any edge device can be accessed by a customer application using the same defined API, regardless of the hardware characteristics of the edge device or the original format of the ML model.
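The idea in the abstract can be illustrated with a minimal sketch: one inference API that hides both the model's original training framework and the device hardware. All names here (`CommonModel`, `InferenceEngine`, `load_model`, `infer`) are hypothetical illustrations, not the patented implementation.

```python
from dataclasses import dataclass, field


@dataclass
class CommonModel:
    """A model translated from any training framework into one common format."""
    source_framework: str                      # e.g. "tensorflow", "pytorch", "mxnet"
    graph: dict = field(default_factory=dict)  # placeholder for the translated graph


class InferenceEngine:
    """Runs common-format models; hardware-specific optimization happens inside."""

    def __init__(self, hardware: str):
        self.hardware = hardware  # e.g. "cpu", "gpu", "npu"

    def load_model(self, model: CommonModel) -> None:
        # Hardware-agnostic and hardware-specific optimizations would run here.
        self.model = model

    def infer(self, sample):
        # The customer application calls this same method on any device.
        return {"hardware": self.hardware, "prediction": sum(sample)}


# The same application code runs unchanged on heterogeneous devices:
model = CommonModel(source_framework="pytorch")
for hw in ("cpu", "gpu"):
    engine = InferenceEngine(hw)
    engine.load_model(model)
    engine.infer([1, 2, 3])  # same defined API, any hardware
```

The point of the sketch is the call-site uniformity: the application never branches on hardware or source framework, because translation and optimization happen behind the common interface.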

    Knowledge distillation and automatic model retraining via edge device sample collection

    Publication No.: US10990850B1

    Publication Date: 2021-04-27

    Application No.: US16217400

    Filing Date: 2018-12-12

    Abstract: Techniques for machine learning (ML) model knowledge distillation and automatic retraining are described. A model adaptation controller obtains samples generated by an edge device and inference values generated from those samples by a deployed ML model on the edge device. The model adaptation controller runs inference on the samples using a different ML model to generate inferences that can be used to determine whether the performance of the deployed ML model is lacking. If so, the model adaptation controller can retrain the deployed ML model using samples with ground-truth values generated by the different ML model, resulting in a lightweight retrained model that can be provisioned to the edge device. This retraining process may be performed iteratively to automatically improve and adapt the ML model running at the edge device.
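The retraining loop described above can be sketched as follows, using deliberately simple stand-in functions. A larger "teacher" model labels samples collected from the edge device; when the deployed "student" model disagrees with the teacher too often, the student is retrained on the teacher's labels. All names and thresholds here are hypothetical illustrations, not the patented method.

```python
def teacher_infer(sample):
    # Stand-in for the larger, more accurate model run by the controller.
    return 1 if sum(sample) > 0 else 0


def accuracy(student, samples, labels):
    # Fraction of samples where the deployed model matches the teacher.
    correct = sum(1 for s, y in zip(samples, labels) if student(s) == y)
    return correct / len(samples)


# Samples collected from the edge device, labeled by the teacher.
samples = [[1, 2], [-3, 1], [0.5, -2], [4, -1]]
teacher_labels = [teacher_infer(s) for s in samples]

# A deliberately poor deployed student model: always predicts 0.
deployed = lambda s: 0

# If the deployed model's agreement with the teacher is lacking, retrain it.
if accuracy(deployed, samples, teacher_labels) < 0.9:
    # "Retraining" here simply distills the teacher's decision rule into
    # a lightweight replacement that can be provisioned back to the device.
    deployed = lambda s: 1 if sum(s) > 0 else 0

assert accuracy(deployed, samples, teacher_labels) == 1.0
```

In practice this check-and-retrain cycle would run iteratively as new samples arrive, which is the "automatic" part of the abstract.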

    High performance machine learning inference framework for edge devices

    Publication No.: US11704577B1

    Publication Date: 2023-07-18

    Application No.: US17716945

    Filing Date: 2022-04-08

    CPC classification number: G06N5/027 G06F16/116 G06N20/00

    Abstract: Techniques for high-performance machine learning (ML) inference in heterogeneous edge devices are described. An ML model trained using any of a variety of different frameworks is translated into a common format that is runnable by the inference engines of edge devices. The translated model is optimized in hardware-agnostic and/or hardware-specific ways to improve inference performance, and the optimized model is sent to the edge devices. The inference engine for any edge device can be accessed by a customer application using the same defined API, regardless of the hardware characteristics of the edge device or the original format of the ML model.
