-
Publication (announcement) No.: US11487765B1
Publication (announcement) date: 2022-11-01
Application No.: US17360981
Application date: 2021-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Amaresh Ankit Siva
IPC: G06F16/00, G06F16/2455, G06F16/2458
Abstract: An algorithm releases answers to very large numbers of statistical queries, e.g., k-way marginals, subject to differential privacy. The algorithm answers queries on a private dataset using simple perturbation, and then attempts to find a synthetic dataset that most closely matches the noisy answers. The algorithm uses a continuous relaxation of the synthetic dataset domain which makes the projection loss differentiable, and allows the use of efficient machine learning optimization techniques and tooling. Rather than answering all queries up front, the algorithm makes judicious use of a privacy budget by iteratively and adaptively finding queries for which relaxed synthetic data has high error, and then repeating the projection. The algorithm is effective across a range of parameters and datasets, especially when a privacy budget is small or a query class is large.
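The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the patented implementation. It assumes binary attributes and 2-way marginal queries, relaxes synthetic rows to values in [0, 1] so that each query answer is differentiable, and uses plain gradient descent for the projection. The function names, the noisy-selection rule, and the noise scale `sigma` are all hypothetical placeholders; in particular, `sigma` is not calibrated to any formal privacy guarantee.

```python
import itertools
import numpy as np


def marginal_answers(data, pairs):
    """Mean of x_i * x_j over rows, for every attribute pair (i, j)."""
    return np.array([(data[:, i] * data[:, j]).mean() for i, j in pairs])


def project(synth, pairs, targets, steps=300, lr=2.0):
    """Gradient descent on the squared error between relaxed answers and targets."""
    n = synth.shape[0]
    for _ in range(steps):
        resid = marginal_answers(synth, pairs) - targets
        grad = np.zeros_like(synth)
        for r, (i, j) in zip(resid, pairs):
            # d/dS of mean(S[:, i] * S[:, j]), scaled by the residual.
            grad[:, i] += 2.0 * r * synth[:, j] / n
            grad[:, j] += 2.0 * r * synth[:, i] / n
        # Clipping keeps every entry inside the relaxed domain [0, 1].
        synth = np.clip(synth - lr * grad, 0.0, 1.0)
    return synth


def relaxed_adaptive_projection(private, rounds=5, synth_rows=50,
                                sigma=0.01, seed=0):
    """Adaptively pick high-error queries, measure them noisily, re-project."""
    rng = np.random.default_rng(seed)
    d = private.shape[1]
    all_pairs = list(itertools.combinations(range(d), 2))
    true_answers = marginal_answers(private, all_pairs)
    synth = rng.uniform(0.0, 1.0, size=(synth_rows, d))
    used = np.zeros(len(all_pairs), dtype=bool)
    chosen, noisy = [], []
    for _ in range(rounds):
        errors = np.abs(marginal_answers(synth, all_pairs) - true_answers)
        errors[used] = -np.inf  # never reselect a query
        # Placeholder noisy selection: argmax of error plus Gumbel noise.
        pick = int(np.argmax(errors + rng.gumbel(scale=sigma, size=errors.size)))
        used[pick] = True
        chosen.append(pick)
        # Placeholder perturbation: Gaussian noise on the selected answer.
        noisy.append(true_answers[pick] + rng.normal(scale=sigma))
        synth = project(synth, [all_pairs[k] for k in chosen], np.array(noisy))
    return synth, all_pairs, chosen
```

Calling `relaxed_adaptive_projection(private)` on a 0/1 array of shape `(rows, attributes)` returns the relaxed synthetic data along with the list of query pairs and the indices selected in each round. Only the noisy answers of the selected queries drive the projection, which is the sense in which the budget is spent adaptively rather than on all queries up front.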
-
Publication (announcement) No.: US11841863B1
Publication (announcement) date: 2023-12-12
Application No.: US17954260
Application date: 2022-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Amaresh Ankit Siva
IPC: G06F16/00, G06F16/2455, G06F16/2458
CPC classification number: G06F16/24568, G06F16/2462
Abstract: An algorithm releases answers to very large numbers of statistical queries, e.g., k-way marginals, subject to differential privacy. The algorithm answers queries on a private dataset using simple perturbation, and then attempts to find a synthetic dataset that most closely matches the noisy answers. The algorithm uses a continuous relaxation of the synthetic dataset domain which makes the projection loss differentiable, and allows the use of efficient machine learning optimization techniques and tooling. Rather than answering all queries up front, the algorithm makes judicious use of a privacy budget by iteratively and adaptively finding queries for which relaxed synthetic data has high error, and then repeating the projection. The algorithm is effective across a range of parameters and datasets, especially when a privacy budget is small or a query class is large.
-