Managing defects in a model training pipeline using synthetic data sets associated with defect types

    Publication (Announcement) No.: US11580425B2

    Publication (Announcement) Date: 2023-02-14

    Application No.: US16917769

    Filing Date: 2020-06-30

    Abstract: The disclosure herein describes managing defects in a model training pipeline. A synthetic data set is generated that is associated with a defect type and a lifecycle stage of the model training pipeline, and baseline performance metrics associated with the defect type are generated. Based on a code change to the pipeline, a test model is trained using the pipeline and the synthetic data set, and test performance metrics are collected based on the test model and associated with the defect type. Based on comparing the baseline performance metrics and the test performance metrics, a defect of a particular defect type is identified in the pipeline. An indicator of the defect is provided that includes the defect type and the lifecycle stage with which the synthetic data set is associated, whereby a defect correction process is enabled to remedy the defect based on the associated defect type and the lifecycle stage.
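
The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the `find_defects` helper, the fixed `tolerance` threshold, and the example defect types, lifecycle stages, and metric values are all hypothetical, chosen only to show how a synthetic data set tagged with a defect type and lifecycle stage might let a metric regression be reported together with those tags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticDataSet:
    """A synthetic data set tagged with the defect type it is meant to expose."""
    name: str
    defect_type: str       # hypothetical label, e.g. "label_noise_sensitivity"
    lifecycle_stage: str   # hypothetical label, e.g. "data_preprocessing"


@dataclass(frozen=True)
class DefectIndicator:
    """Reports a suspected defect along with its type and lifecycle stage."""
    defect_type: str
    lifecycle_stage: str
    metric: str
    baseline: float
    observed: float


def find_defects(datasets, baseline_metrics, test_metrics, tolerance=0.05):
    """Compare baseline and test metrics for each synthetic data set.

    A metric that moves by more than `tolerance` (an assumed, fixed
    threshold) is reported as a defect of the data set's defect type.
    """
    indicators = []
    for ds in datasets:
        baseline = baseline_metrics[ds.name]
        observed_metrics = test_metrics[ds.name]
        for metric, base_value in baseline.items():
            observed = observed_metrics[metric]
            if abs(observed - base_value) > tolerance:
                indicators.append(
                    DefectIndicator(ds.defect_type, ds.lifecycle_stage,
                                    metric, base_value, observed)
                )
    return indicators


# Illustrative values only; none of these come from the patent.
datasets = [
    SyntheticDataSet("noisy_labels", "label_noise_sensitivity", "data_preprocessing"),
    SyntheticDataSet("skewed_classes", "class_imbalance_handling", "model_training"),
]
baseline_metrics = {
    "noisy_labels": {"accuracy": 0.90},
    "skewed_classes": {"accuracy": 0.88},
}
# Metrics collected from a test model trained after a code change.
test_metrics = {
    "noisy_labels": {"accuracy": 0.70},    # large drop: flagged
    "skewed_classes": {"accuracy": 0.87},  # within tolerance: not flagged
}

indicators = find_defects(datasets, baseline_metrics, test_metrics)
for indicator in indicators:
    print(indicator)
```

Because each indicator carries the defect type and lifecycle stage of the data set that triggered it, a downstream correction process would know which part of the pipeline to inspect first. A real pipeline would likely use per-metric thresholds or statistical tests rather than a single fixed tolerance.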
