-
Publication No.: US20220172101A1
Publication date: 2022-06-02
Application No.: US17106029
Application date: 2020-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Sanjiv Das, Michele Donini, Jason Lawrence Gelman, Kevin Haas, Tyler Stephen Hill, Krishnaram Kenthapadi, Pinar Altin Yilmaz, Muhammad Bilal Zafar, Pedro L Larroy
Abstract: Feature attribution may be captured as part of a machine learning pipeline. A training job may include a request to determine feature attribution as part of a machine learning pipeline that trains a machine learning model from a training data set. A reference data set for determining the feature attribution of the machine learning model may be identified. The feature attribution may be determined based on the reference data set. The feature attribution of the trained machine learning model may be stored.
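As an illustration of attribution computed against a reference data set, the following is a minimal sketch only. It assumes a simple occlusion-style score (the change in prediction when one feature is replaced by its mean over the reference rows); the abstract does not say which attribution method is used, and `baseline_attribution`, `toy_model`, and the sample data are hypothetical names introduced here.

```python
from statistics import mean
from typing import Callable, List, Sequence


def baseline_attribution(
    predict: Callable[[Sequence[float]], float],
    instance: Sequence[float],
    reference: Sequence[Sequence[float]],
) -> List[float]:
    """Score each feature by how much the prediction changes when that
    feature is replaced by its mean over the reference data set.
    (Assumed method for illustration, not the one claimed in the patent.)"""
    baseline = [mean(column) for column in zip(*reference)]
    full_prediction = predict(instance)
    scores = []
    for i in range(len(instance)):
        perturbed = list(instance)
        perturbed[i] = baseline[i]  # swap in the reference value for feature i
        scores.append(full_prediction - predict(perturbed))
    return scores


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained model."""
    return 2.0 * x[0] + 0.5 * x[1]


reference_rows = [[0.0, 0.0], [2.0, 4.0]]
attributions = baseline_attribution(toy_model, [3.0, 6.0], reference_rows)
print(attributions)
```

In a pipeline, the resulting list would be stored alongside the trained model so that it can be inspected later.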
-
Publication No.: US12080056B1
Publication date: 2024-09-03
Application No.: US17535909
Application date: 2021-11-26
Applicant: Amazon Technologies, Inc.
Inventor: Ashish Rajendra Rathi, Michele Donini, Tyler Stephen Hill, Krishnaram Kenthapadi, Xinyu Liu, Pinar Altin Yilmaz, Muhammad Bilal Zafar
IPC: G06V10/70, G06T7/10, G06V10/764, G06V10/77, G06V20/50
CPC classification number: G06V10/87, G06T7/10, G06V10/764, G06V10/768, G06V10/7715, G06V20/50
Abstract: Explanation jobs may be performed for computer vision tasks. A request to execute an explanation job for a computer vision machine learning model may be received. The execution job may be performed, including extracting different features from the image, determining the respective relative importance values of the different features on inferences generated by the computer vision machine learning model. The result of the explanation job, including the generated heat maps, may be provided.
-
Publication No.: US11487765B1
Publication date: 2022-11-01
Application No.: US17360981
Application date: 2021-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Amaresh Ankit Siva
IPC: G06F16/00, G06F16/2455, G06F16/2458
Abstract: An algorithm releases answers to very large numbers of statistical queries, e.g., k-way marginals, subject to differential privacy. The algorithm answers queries on a private dataset using simple perturbation, and then attempts to find a synthetic dataset that most closely matches the noisy answers. The algorithm uses a continuous relaxation of the synthetic dataset domain which makes the projection loss differentiable, and allows the use of efficient machine learning optimization techniques and tooling. Rather than answering all queries up front, the algorithm makes judicious use of a privacy budget by iteratively and adaptively finding queries for which relaxed synthetic data has high error, and then repeating the projection. The algorithm is effective across a range of parameters and datasets, especially when a privacy budget is small or a query class is large.
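Two of the steps named in this abstract, perturbing query answers and finding the query on which a candidate synthetic data set has the highest error, can be sketched as follows. This is a rough illustration under stated assumptions: Gaussian noise is assumed for the "simple perturbation", calibrating `sigma` to a privacy budget is left out, the differentiable projection step is not shown, and all function names and sample rows are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Query = Tuple[int, int]  # (column index, value): a 1-way marginal query


def marginal(rows: Sequence[Sequence[int]], column: int, value: int) -> float:
    """Fraction of rows whose `column` equals `value`."""
    return sum(1 for row in rows if row[column] == value) / len(rows)


def noisy_answers(rows, queries: Sequence[Query], sigma: float,
                  rng: random.Random) -> List[float]:
    """Answer each query on the private rows, then add Gaussian noise.
    Choosing sigma to meet a privacy budget is outside this sketch."""
    return [marginal(rows, c, v) + rng.gauss(0.0, sigma) for c, v in queries]


def worst_query(synthetic_rows, queries: Sequence[Query],
                targets: Sequence[float]) -> int:
    """Index of the query where the synthetic data is furthest from the
    (noisy) target answer; a candidate for the next adaptive round."""
    errors = [abs(marginal(synthetic_rows, c, v) - t)
              for (c, v), t in zip(queries, targets)]
    return max(range(len(queries)), key=errors.__getitem__)


private_rows = [(0, 1), (1, 1), (0, 0), (1, 1)]
synthetic_rows = [(0, 0), (1, 0)]
queries = [(0, 1), (1, 1)]
targets = noisy_answers(private_rows, queries, sigma=0.05, rng=random.Random(0))
print(worst_query(synthetic_rows, queries, targets))
```

A full implementation would re-fit the synthetic data after each selection and repeat until the privacy budget is spent.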
-
Publication No.: US20220172004A1
Publication date: 2022-06-02
Application No.: US17106027
Application date: 2020-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Sanjiv Das, Michele Donini, Jason Lawrence Gelman, Kevin Haas, Tyler Stephen Hill, Krishnaram Kenthapadi, Pinar Altin Yilmaz, Muhammad Bilal Zafar, Pedro L. Larroy
Abstract: Bias metrics and feature attribution may be monitored for a machine learning model. A request to enable monitoring for bias metrics or feature attribution may be received. Monitoring may be enabled to evaluate respective performance of inferences of a machine learning model according to the enabled bias metrics or feature attribution. If a divergence from reference data is detected, then a notification indicating the divergence may be sent.
-
Publication No.: US20240232526A1
Publication date: 2024-07-11
Application No.: US18610140
Application date: 2024-03-19
Applicant: Amazon Technologies, Inc.
Inventor: Cedric Philippe Archambeau, Sanjiv Ranjan Das, Michele Donini, Michaela Hardt, Tyler Stephen Hill, Krishnaram Kenthapadi, Pedro L Larroy, Xinyu Liu, Keerthan Harish Vasist, Pinar Altin Yilmaz, Muhammad Bilal Zafar
CPC classification number: G06F40/20, G06F16/2246, G06N5/01
Abstract: A determination is made that an explanatory data set for a common set of predictions generated by a machine learning model for records containing text tokens is to be provided. Respective groups of related tokens are identified from the text attributes of the records, and record-level prediction influence scores are generated for the token groups. An aggregate prediction influence score is generated for at least some of the token groups from the record-level scores, and an explanatory data set based on the aggregate scores is presented.
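The step from record-level scores to an aggregate score per token group might look like the sketch below. The aggregation rule (mean absolute score over the records in which a group appears) is an assumption made for illustration, since the abstract does not name one; `aggregate_scores` and the sample scores are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def aggregate_scores(record_scores: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Combine per-record influence scores into one score per token group.
    Assumed rule: mean of absolute scores over records containing the group."""
    collected: Dict[str, List[float]] = defaultdict(list)
    for scores in record_scores:
        for group, score in scores.items():
            collected[group].append(abs(score))
    return {group: sum(vals) / len(vals) for group, vals in collected.items()}


# Hypothetical record-level scores for two text records.
records = [
    {"refund": 0.5, "late": -0.25},
    {"refund": 0.25},
]
aggregate = aggregate_scores(records)
print(aggregate)
```

Taking absolute values keeps strongly negative and strongly positive influences from cancelling out; a signed mean would be an equally plausible choice.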
-
Publication No.: US11977836B1
Publication date: 2024-05-07
Application No.: US17535945
Application date: 2021-11-26
Applicant: Amazon Technologies, Inc.
Inventor: Cedric Philippe Archambeau, Sanjiv Ranjan Das, Michele Donini, Michaela Hardt, Tyler Stephen Hill, Krishnaram Kenthapadi, Pedro L Larroy, Xinyu Liu, Keerthan Harish Vasist, Pinar Altin Yilmaz, Muhammad Bilal Zafar
CPC classification number: G06F40/20, G06F16/2246, G06N5/01
Abstract: A determination is made that an explanatory data set for a common set of predictions generated by a machine learning model for records containing text tokens is to be provided. Respective groups of related tokens are identified from the text attributes of the records, and record-level prediction influence scores are generated for the token groups. An aggregate prediction influence score is generated for at least some of the token groups from the record-level scores, and an explanatory data set based on the aggregate scores is presented.
-
Publication No.: US20220172099A1
Publication date: 2022-06-02
Application No.: US17106013
Application date: 2020-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Sanjiv Das, Michele Donini, Jason Lawrence Gelman, Kevin Haas, Tyler Stephen Hill, Krishnaram Kenthapadi, Pinar Altin Yilmaz, Muhammad Bilal Zafar, Pedro L Larroy
Abstract: Bias metrics may be captured at different stages for training a machine learning model. A training job may specify bias metrics to capture at multiple different stages of a machine learning pipeline for a feature of a training data set used to train a machine learning model. The training job may be executed and the bias metrics determined at the stages as specified in the training job. The bias metrics for the different stages may be stored.
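To make "bias metrics at different stages" concrete, here is a small sketch that computes one metric before training (on labels) and after training (on predictions). The metric, a difference in positive-outcome rates between two groups of a feature, is a common choice assumed here for illustration; the abstract does not specify which metrics are used, and all names and data below are hypothetical.

```python
from typing import Dict, Sequence


def positive_rate_gap(outcomes: Sequence[int], groups: Sequence[str],
                      group_a: str, group_d: str) -> float:
    """Positive-outcome rate of group_a minus that of group_d."""
    def rate(name: str) -> float:
        values = [o for o, g in zip(outcomes, groups) if g == name]
        return sum(values) / len(values)
    return rate(group_a) - rate(group_d)


# Hypothetical data: a sensitive feature, training labels, model predictions.
groups = ["a", "a", "d", "d"]
labels = [1, 1, 1, 0]
predictions = [1, 1, 0, 0]

# One entry per pipeline stage, as would be stored after the training job.
stage_metrics: Dict[str, float] = {
    "pre_training": positive_rate_gap(labels, groups, "a", "d"),
    "post_training": positive_rate_gap(predictions, groups, "a", "d"),
}
print(stage_metrics)
```

Comparing the two entries shows whether the trained model widened or narrowed a gap that was already present in the training data.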
-
Publication No.: US20220171991A1
Publication date: 2022-06-02
Application No.: US17106021
Application date: 2020-11-27
Applicant: Amazon Technologies, Inc.
Inventor: Sanjiv Das, Michele Donini, Jason Lawrence Gelman, Kevin Haas, Tyler Stephen Hill, Krishnaram Kenthapadi, Pinar Altin Yilmaz, Muhammad Bilal Zafar, Pedro L Larroy
Abstract: Views may be generated for bias metrics or feature attribution captured in machine learning pipelines. A request to create a view of bias metrics or feature attribution may be received. The bias metrics or feature attribution may have been determined in a machine learning pipeline as part of executing a training job that specified the bias metrics or the feature attribution. A development application may access a data store that stores the bias metrics or the feature attribution determined in the machine learning pipeline. A view based on the bias metrics or feature attribution may be generated and provided.
-
Publication No.: US11841863B1
Publication date: 2023-12-12
Application No.: US17954260
Application date: 2022-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Amaresh Ankit Siva
IPC: G06F16/00, G06F16/2455, G06F16/2458
CPC classification number: G06F16/24568, G06F16/2462
Abstract: An algorithm releases answers to very large numbers of statistical queries, e.g., k-way marginals, subject to differential privacy. The algorithm answers queries on a private dataset using simple perturbation, and then attempts to find a synthetic dataset that most closely matches the noisy answers. The algorithm uses a continuous relaxation of the synthetic dataset domain which makes the projection loss differentiable, and allows the use of efficient machine learning optimization techniques and tooling. Rather than answering all queries up front, the algorithm makes judicious use of a privacy budget by iteratively and adaptively finding queries for which relaxed synthetic data has high error, and then repeating the projection. The algorithm is effective across a range of parameters and datasets, especially when a privacy budget is small or a query class is large.
-
Publication No.: US11481659B1
Publication date: 2022-10-25
Application No.: US16917757
Application date: 2020-06-30
Applicant: Amazon Technologies, Inc.
Inventor: Valerio Perrone, Michele Donini, Krishnaram Kenthapadi, Cedric Philippe Archambeau
Abstract: Hyperparameters for tuning a machine learning system may be optimized for fairness using Bayesian optimization with constraints for accuracy and bias. Hyperparameter optimization may be performed for a received training set and received accuracy and fairness constraints. Respective probabilistic models for accuracy and bias of the machine learning system may be initialized, then hyperparameter optimization may include iteratively identifying respective values for hyperparameters using analysis of the respective models performed using an acquisition function implementing constrained expected improvement on the respective models, training the machine learning system using the identified values to determine measures of accuracy and bias, and updating the respective models using the determined measures.
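The acquisition function mentioned in this abstract can be illustrated with the textbook form of constrained expected improvement: expected improvement on the objective multiplied by the probability that the constraint is satisfied. The sketch below assumes independent Gaussian posteriors for a loss to minimize and for a bias measure with an upper limit; these modelling choices and all function and parameter names are assumptions for illustration, not details taken from the patent.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu: float, sigma: float, best: float) -> float:
    """Expected improvement over `best` for a loss to be minimized,
    given a Gaussian posterior with mean `mu` and std `sigma`."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    return (best - mu) * norm_cdf(z) + sigma * norm_pdf(z)


def constrained_ei(mu_loss: float, sigma_loss: float, best_loss: float,
                   mu_bias: float, sigma_bias: float, bias_limit: float) -> float:
    """EI on the loss, weighted by P(bias <= bias_limit) under the
    (assumed Gaussian, sigma_bias > 0) posterior of the bias model."""
    p_feasible = norm_cdf((bias_limit - mu_bias) / sigma_bias)
    return expected_improvement(mu_loss, sigma_loss, best_loss) * p_feasible


# Hypothetical posterior values for one candidate hyperparameter setting.
print(constrained_ei(0.18, 0.05, 0.20, mu_bias=0.08, sigma_bias=0.02, bias_limit=0.10))
```

In each iteration of the loop described above, the candidate that maximizes this quantity would be trained next, and both models would then be updated with the measured accuracy and bias.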
-