-
Publication No.: US11841863B1
Publication Date: 2023-12-12
Application No.: US17954260
Filing Date: 2022-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Amaresh Ankit Siva
IPC: G06F16/00, G06F16/2455, G06F16/2458
CPC classification number: G06F16/24568, G06F16/2462
Abstract: An algorithm releases answers to very large numbers of statistical queries, e.g., k-way marginals, subject to differential privacy. The algorithm answers queries on a private dataset using simple perturbation, and then attempts to find a synthetic dataset that most closely matches the noisy answers. The algorithm uses a continuous relaxation of the synthetic dataset domain which makes the projection loss differentiable, and allows the use of efficient machine learning optimization techniques and tooling. Rather than answering all queries up front, the algorithm makes judicious use of a privacy budget by iteratively and adaptively finding queries for which relaxed synthetic data has high error, and then repeating the projection. The algorithm is effective across a range of parameters and datasets, especially when a privacy budget is small or a query class is large.
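The perturb-then-project idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes binary features, 2-way marginals as the query class, Gaussian noise with a hand-picked scale `sigma` (calibrating the noise to a privacy budget is omitted), and plain projected gradient descent in place of the machine learning tooling the abstract mentions. All function names are hypothetical.

```python
import random

def marginal(rows, i, j):
    """2-way marginal: mean of x_i * x_j. Works for 0/1 rows and for relaxed rows in [0, 1]."""
    return sum(r[i] * r[j] for r in rows) / len(rows)

def noisy_answers(private_rows, queries, sigma, rng):
    """Simple perturbation: add Gaussian noise to each true answer.
    sigma is an assumed noise scale, not derived from a privacy budget here."""
    return [marginal(private_rows, i, j) + rng.gauss(0.0, sigma) for i, j in queries]

def projection_loss(synth, queries, targets):
    """Squared error between the synthetic data's answers and the noisy answers."""
    return sum((marginal(synth, i, j) - t) ** 2 for (i, j), t in zip(queries, targets))

def project(init, queries, targets, steps=200, lr=0.5):
    """Projected gradient descent over relaxed synthetic rows with entries in [0, 1]."""
    synth = [list(r) for r in init]
    n_rows, n_cols = len(synth), len(synth[0])
    for _ in range(steps):
        grad = [[0.0] * n_cols for _ in range(n_rows)]
        for (i, j), t in zip(queries, targets):
            err = marginal(synth, i, j) - t
            for r, g in zip(synth, grad):
                # d/dx_i of mean(x_i * x_j) is x_j / n_rows, and symmetrically for x_j
                g[i] += 2 * err * r[j] / n_rows
                g[j] += 2 * err * r[i] / n_rows
        for r, g in zip(synth, grad):
            for c in range(n_cols):
                # clip back into the relaxed domain [0, 1]
                r[c] = min(1.0, max(0.0, r[c] - lr * g[c]))
    return synth
```

Because the relaxed rows take values in [0, 1] rather than {0, 1}, the marginal is a smooth function of the synthetic data, which is what lets a gradient-based optimizer be used for the projection step.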
-
Publication No.: US11374952B1
Publication Date: 2022-06-28
Application No.: US16586147
Filing Date: 2019-09-27
Applicant: Amazon Technologies, Inc.
Inventor: Baris Coskun, Wei Ding, Luca Melis
Abstract: Techniques for monitoring a computing environment for anomalous activity are presented. An example method includes receiving a request to invoke an action within a computing environment, with the request including a plurality of request attributes and a plurality of contextual attributes. A normalcy score is generated for the received request by encoding the received request into a code in latent space of an autoencoder, reconstructing the request from the code, and generating a probability distribution indicating a likelihood that the reconstructed request attributes exist in a data set of non-anomalous activity. Based on the calculated normalcy score, one or more actions are taken to process the request such that execution of non-anomalous requests is allowed, and execution of potentially anomalous requests may be blocked pending confirmation.
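The scoring and gating step described in this abstract might look roughly like the sketch below. The autoencoder itself is not shown: the sketch assumes a decoder has already produced one vector of logits per categorical request attribute. The score definition (geometric mean of the probabilities assigned to the observed values), the threshold of 0.5, and every name used are assumptions for illustration, not details taken from the patent.

```python
import math

def softmax(logits):
    """Turn decoder logits for one categorical attribute into probabilities."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalcy_score(observed, logits_per_attribute):
    """Geometric mean of the probabilities the decoder assigns to the observed values.
    observed[k] is the index of the value actually seen for attribute k."""
    log_p = 0.0
    for value_index, logits in zip(observed, logits_per_attribute):
        log_p += math.log(softmax(logits)[value_index])
    return math.exp(log_p / len(observed))

def decide(score, threshold=0.5):
    """Allow requests that look normal; hold the rest pending confirmation.
    The threshold is a hypothetical tuning parameter."""
    return "allow" if score >= threshold else "hold_for_confirmation"
```

A request whose observed attribute values receive high reconstructed probability scores close to 1 and is allowed; one whose values the decoder considers unlikely scores close to 0 and is held.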
-
Publication No.: US11537902B1
Publication Date: 2022-12-27
Application No.: US16912527
Filing Date: 2020-06-25
Applicant: Amazon Technologies, Inc.
Inventor: Sergul Aydore, Baris Coskun, Luca Melis
IPC: G06F16/245, G06N3/08, G06N3/04
Abstract: Systems, devices, and methods are provided for detecting anomalous events from categorical data using autoencoders. A system may receive a data set associated with actions requested within the computing environment, wherein the data set includes first categorical data indicative of anomalous activity in the computing environment. The system may train an autoencoder to reconstruct approximations of requests associated with the computing environment based on the received data set, wherein training the autoencoder includes using a beta divergence and a maximum mean discrepancy divergence. The trained system may receive a request to invoke an action within the computing environment, may generate a reconstruction of the request to invoke the action using the trained autoencoder, may determine a normalcy score based on a probability that the reconstruction of the request exists in the training data set, and, based on the calculated normalcy score, may determine whether requests indicate anomalous data.
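Of the two training divergences named in this abstract, the maximum mean discrepancy (MMD) is easy to sketch in isolation. The example below computes the standard biased estimate of squared MMD between two samples using an RBF kernel. The kernel choice, the bandwidth, and which two samples are compared during training (for instance latent codes versus draws from a prior) are assumptions; the abstract does not specify them, and the beta divergence term is not covered here.

```python
import math

def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y)."""
    total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    return (mean_kernel(xs, xs, bandwidth)
            + mean_kernel(ys, ys, bandwidth)
            - 2.0 * mean_kernel(xs, ys, bandwidth))
```

The estimate is zero when the two samples coincide and grows as they move apart, which is why it can serve as a penalty that pulls one distribution toward another during training.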
-
Publication No.: US11487765B1
Publication Date: 2022-11-01
Application No.: US17360981
Filing Date: 2021-06-28
Applicant: Amazon Technologies, Inc.
Inventor: Sergul Aydore, William Brown, Michael Kearns, Krishnaram Kenthapadi, Luca Melis, Aaron Roth, Amaresh Ankit Siva
IPC: G06F16/00, G06F16/2455, G06F16/2458
Abstract: An algorithm releases answers to very large numbers of statistical queries, e.g., k-way marginals, subject to differential privacy. The algorithm answers queries on a private dataset using simple perturbation, and then attempts to find a synthetic dataset that most closely matches the noisy answers. The algorithm uses a continuous relaxation of the synthetic dataset domain which makes the projection loss differentiable, and allows the use of efficient machine learning optimization techniques and tooling. Rather than answering all queries up front, the algorithm makes judicious use of a privacy budget by iteratively and adaptively finding queries for which relaxed synthetic data has high error, and then repeating the projection. The algorithm is effective across a range of parameters and datasets, especially when a privacy budget is small or a query class is large.
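The adaptive step in this abstract, finding queries on which the current synthetic data has high error, could be sketched as a noisy top-k selection. The sketch assumes the true and synthetic answers are already available as lists and adds Gumbel noise to each error before ranking. The selection mechanism, the `noise_scale` parameter, and the function names are illustrative assumptions; the abstract does not say how the selection is made private.

```python
import math
import random

def gumbel(rng, scale):
    """Draw Gumbel noise with the given scale (scale 0 yields no noise)."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return -scale * math.log(-math.log(u))

def select_high_error_queries(private_answers, synthetic_answers, k,
                              noise_scale, rng, already_selected=()):
    """Return indices of the k queries with the largest noisy absolute error,
    skipping queries chosen in earlier rounds."""
    scored = []
    for q, (a, s) in enumerate(zip(private_answers, synthetic_answers)):
        if q in already_selected:
            continue
        scored.append((abs(a - s) + gumbel(rng, noise_scale), q))
    scored.sort(reverse=True)
    return [q for _, q in scored[:k]]
```

In an iterative scheme, each round would select a few such queries, obtain noisy answers for them, and rerun the projection on the enlarged query set, so that the privacy budget is spent only where the synthetic data is currently inaccurate.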
-