SCHEDULING ML SERVICES AND MODELS WITH HETEROGENEOUS RESOURCES

    Publication No.: US20240185098A1

    Publication date: 2024-06-06

    Application No.: US17782616

    Filing date: 2022-04-15

    CPC classification number: G06N5/04

    Abstract: A system determines a timing matrix of the inference times taken by a number of machine learning (ML) models when executed on a number of processing resources of a computing device. The processing resources include at least a first type and a second type of processing resource. The system applies a service-specific model-first scheduling scheme or a service-specific hardware-first scheduling scheme to obtain corresponding service-specific mappings, and determines a best mapping from among them. The system then schedules each ML model to its corresponding processing resource according to the best mapping and executes the ML models on the mapped processing resources.
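
The flow in the abstract can be illustrated with a minimal sketch. The abstract does not define the two schemes or what makes a mapping "best", so everything below is an assumption made for illustration: "model-first" is read as visiting models in order and placing each on the resource where it would finish earliest, "hardware-first" as repeatedly giving the least-loaded resource the unassigned model it runs fastest, and "best" as the smallest makespan (the busiest resource's total time). Grouping of models by service is omitted. The model names, resource names, timing values, and function names are all hypothetical.

```python
# Hypothetical timing matrix: TIMING[m][r] is the inference time (ms) of
# model m on processing resource r. All values are illustrative only.
MODELS = ["detector", "classifier", "segmenter"]
RESOURCES = ["cpu", "gpu"]  # two resource types, as the abstract requires
TIMING = [
    [30.0, 10.0],  # detector
    [8.0, 6.0],    # classifier
    [50.0, 20.0],  # segmenter
]


def makespan(mapping, timing, n_resources):
    """Total time of the busiest resource under a model -> resource mapping."""
    load = [0.0] * n_resources
    for m, r in mapping.items():
        load[r] += timing[m][r]
    return max(load)


def model_first(timing, n_resources):
    """Assumed reading: visit models in order and place each one on the
    resource where it would finish earliest given the current load."""
    load = [0.0] * n_resources
    mapping = {}
    for m, row in enumerate(timing):
        r = min(range(n_resources), key=lambda j: load[j] + row[j])
        mapping[m] = r
        load[r] += row[r]
    return mapping


def hardware_first(timing, n_resources):
    """Assumed reading: repeatedly pick the least-loaded resource and give
    it the unassigned model that it runs fastest."""
    load = [0.0] * n_resources
    mapping = {}
    unassigned = set(range(len(timing)))
    while unassigned:
        r = min(range(n_resources), key=lambda j: load[j])
        m = min(unassigned, key=lambda i: timing[i][r])
        mapping[m] = r
        load[r] += timing[m][r]
        unassigned.remove(m)
    return mapping


def best_mapping(timing, n_resources):
    """Assumed criterion: keep the candidate mapping with the smallest makespan."""
    candidates = [
        model_first(timing, n_resources),
        hardware_first(timing, n_resources),
    ]
    return min(candidates, key=lambda mp: makespan(mp, timing, n_resources))


if __name__ == "__main__":
    chosen = best_mapping(TIMING, len(RESOURCES))
    for m, r in sorted(chosen.items()):
        print(f"{MODELS[m]} -> {RESOURCES[r]}")
```

With these illustrative timings, the two heuristics produce different mappings, which is why comparing the candidates before scheduling matters: a scheme that looks reasonable locally can still leave one resource heavily overloaded.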
