-
Publication number: US20250077709A1
Publication date: 2025-03-06
Application number: US18955530
Application date: 2024-11-21
Applicant: Google LLC
Inventor: Alessandro Epasto, Hossein Esfandiari, Vahab Seyed Mirrokni, Andres Munoz Medina, Umar Syed, Sergei Vassilvitskii
Abstract: A computer-implemented method for k-anonymizing a dataset to provide privacy guarantees for all columns in the dataset can include obtaining, by a computing system including one or more computing devices, a dataset comprising data indicative of a plurality of entities and at least one data item respective to at least one of the plurality of entities. The computer-implemented method can include clustering, by the computing system, the plurality of entities into at least one entity cluster. The computer-implemented method can include determining, by the computing system, a majority condition for the at least one entity cluster, the majority condition indicating that the at least one data item is respective to at least a majority of the plurality of entities. The computer-implemented method can include assigning, by the computing system, the at least one data item to the plurality of entities in an anonymized dataset based at least in part on the majority condition.
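The steps named in this abstract (cluster the entities, test a majority condition per cluster, assign the majority items to every entity in the cluster) can be illustrated with a minimal Python sketch. It is not the claimed implementation. The dict-of-sets dataset layout, the function names, the placeholder clustering in `cluster_entities`, and the choice of a strict majority (more than half of a cluster) are all assumptions made for illustration.

```python
from collections import Counter


def cluster_entities(dataset, k):
    """Hypothetical placeholder clustering: sort entities by their item lists
    and cut the ordering into chunks of k. The abstract does not specify a
    clustering method; any method yielding clusters of size >= k would do."""
    ordered = sorted(dataset, key=lambda e: sorted(dataset[e]))
    clusters = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    # Fold a short trailing chunk into the previous one so that every cluster
    # has at least k members (impossible if the dataset has fewer than k entities).
    if len(clusters) > 1 and len(clusters[-1]) < k:
        clusters[-2].extend(clusters.pop())
    return clusters


def anonymize(dataset, k):
    """Give every entity in a cluster the items held by a majority of that cluster."""
    anonymized = {}
    for cluster in cluster_entities(dataset, k):
        counts = Counter(item for e in cluster for item in dataset[e])
        # Assumed majority condition: the item belongs to more than half the cluster.
        majority_items = {item for item, c in counts.items() if 2 * c > len(cluster)}
        for e in cluster:
            anonymized[e] = set(majority_items)
    return anonymized


dataset = {
    "a": {"x", "y"}, "b": {"x"}, "c": {"x", "z"},
    "d": {"q"}, "e": {"q", "r"}, "f": {"q", "r"},
}
result = anonymize(dataset, k=3)
```

Because all members of a cluster receive the same item set, each row of `result` is shared by at least k entities whenever the dataset holds at least k entities.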
-
Publication number: US12164673B2
Publication date: 2024-12-10
Application number: US18345657
Application date: 2023-06-30
Applicant: Google LLC
Inventor: Alessandro Epasto, Hossein Esfandiari, Vahab Seyed Mirrokni, Andres Munoz Medina, Umar Syed, Sergei Vassilvitskii
Abstract: A computer-implemented method for k-anonymizing a dataset to provide privacy guarantees for all columns in the dataset can include obtaining, by a computing system including one or more computing devices, a dataset comprising data indicative of a plurality of entities and at least one data item respective to at least one of the plurality of entities. The computer-implemented method can include clustering, by the computing system, the plurality of entities into at least one entity cluster. The computer-implemented method can include determining, by the computing system, a majority condition for the at least one entity cluster, the majority condition indicating that the at least one data item is respective to at least a majority of the plurality of entities. The computer-implemented method can include assigning, by the computing system, the at least one data item to the plurality of entities in an anonymized dataset based at least in part on the majority condition.
-
Publication number: US11238357B2
Publication date: 2022-02-01
Application number: US16042975
Application date: 2018-07-23
Applicant: Google LLC
IPC: G06F7/00, G06F16/34, G06F16/23, G06F16/24, G06N7/00, H04L9/06, G06F17/10, G06F9/448, G06N20/00, G06F8/30, H04L9/32, G06N5/02, G06N5/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing large datasets using a computationally-efficient representation are disclosed. A request to apply a coverage algorithm to a large input dataset is received. The large dataset includes sets of elements. A computationally-efficient representation of the large dataset is generated by generating a reduced set of elements that contains fewer elements based on a defined probability. For each element in the reduced set, a determination is made regarding whether the element appears in more than a threshold number of sets. When the element appears in more than the threshold number, the element is removed from sets until the element appears in only the threshold number. The coverage algorithm is then applied to the computationally-efficient representation to identify a subset of the sets. The system provides data identifying the subset of the sets in response to the received request.
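The pipeline in this abstract (sample elements with a defined probability, cap how many sets each sampled element appears in, then run a coverage algorithm on the reduced input) can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: the function names, the seeded random generator, the rule that the first sets in insertion order keep a capped element, and the use of a greedy max-coverage routine as the coverage algorithm are all hypothetical choices.

```python
import random


def build_sketch(sets, probability, threshold, seed=0):
    """Keep each element with the given probability, then cap the number of
    sets each kept element appears in at `threshold`."""
    rng = random.Random(seed)
    universe = sorted({x for s in sets.values() for x in s})
    kept = {x for x in universe if rng.random() < probability}
    sketch = {name: set() for name in sets}
    degree = {x: 0 for x in kept}
    for name, members in sets.items():
        for x in members:
            # Assumption: the first `threshold` sets (in insertion order) keep x;
            # the abstract does not say which occurrences are removed.
            if x in kept and degree[x] < threshold:
                sketch[name].add(x)
                degree[x] += 1
    return sketch


def greedy_cover(sets, budget):
    """Greedy max-coverage, used here as a stand-in coverage algorithm:
    repeatedly pick the set that covers the most uncovered elements."""
    covered, chosen = set(), []
    remaining = dict(sets)
    for _ in range(budget):
        if not remaining:
            break
        best = max(remaining, key=lambda n: len(remaining[n] - covered))
        if not remaining[best] - covered:
            break  # nothing new can be covered
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


sets = {"A": {1, 2, 3}, "B": {1, 4}, "C": {1, 5}}
sketch = build_sketch(sets, probability=1.0, threshold=2)
chosen = greedy_cover(sketch, budget=2)
```

With `probability=1.0` every element is kept, so the only reduction comes from the cap: element 1 appears in three input sets but in only two sets of the sketch.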
-
Publication number: US11727147B2
Publication date: 2023-08-15
Application number: US17016788
Application date: 2020-09-10
Applicant: Google LLC
Inventor: Alessandro Epasto, Hossein Esfandiari, Vahab Seyed Mirrokni, Andres Munoz Medina, Umar Syed, Sergei Vassilvitskii
CPC classification number: G06F21/6254, G06F16/285, G06N20/00
Abstract: A computer-implemented method for k-anonymizing a dataset to provide privacy guarantees for all columns in the dataset can include obtaining, by a computing system including one or more computing devices, a dataset comprising data indicative of a plurality of entities and at least one data item respective to at least one of the plurality of entities. The computer-implemented method can include clustering, by the computing system, the plurality of entities into at least one entity cluster. The computer-implemented method can include determining, by the computing system, a majority condition for the at least one entity cluster, the majority condition indicating that the at least one data item is respective to at least a majority of the plurality of entities. The computer-implemented method can include assigning, by the computing system, the at least one data item to the plurality of entities in an anonymized dataset based at least in part on the majority condition.
-
Publication number: US11574067B2
Publication date: 2023-02-07
Application number: US16774380
Application date: 2020-01-28
Applicant: Google LLC
Inventor: Alessandro Epasto, Vahab Seyed Mirrokni, Hossein Esfandiari
Abstract: Example systems and methods enhance user privacy by performing efficient on-device public-private computation on a combination of public and private data, such as, for example, public and private graph data. In particular, the on-device public-private computation framework described herein can enable a device associated with an entity to efficiently compute a combined output that takes into account and is explicitly based upon a combination of data that is associated with the entity and data that is associated with one or more other entities that are private connections of the entity, all without revealing to a centralized computing system a set of locally stored private data that identifies the one or more other entities that are private connections of the entity.
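The idea in this abstract, a device combining public data with data from its private connections while keeping the list of those connections local, can be illustrated with a small Python sketch. The abstract does not describe a concrete protocol, so everything here is assumed for illustration: the `Device` class, a public score table broadcast identically to every device, payloads received directly from contacts, and a simple count-based scoring rule.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    owner: str
    own_items: set
    # Assumption: this set is stored only on the device and never sent
    # to the centralized system.
    private_contacts: set = field(default_factory=set)

    def combined_output(self, public_scores, contact_payloads):
        """Combine a public broadcast with data received from private contacts.

        public_scores: item -> score, the same table for every device (assumed).
        contact_payloads: sender -> items, received directly from peers (assumed).
        """
        scores = dict(public_scores)
        for sender, items in contact_payloads.items():
            if sender not in self.private_contacts:
                continue  # ignore payloads from anyone who is not a private contact
            for item in items:
                scores[item] = scores.get(item, 0) + 1
        # Rank items the owner does not already have, highest score first.
        candidates = [i for i in scores if i not in self.own_items]
        return sorted(candidates, key=lambda i: (-scores[i], i))


device = Device("u", own_items={"a"}, private_contacts={"v", "w"})
public_scores = {"a": 5, "b": 1, "c": 1}
payloads = {"v": {"c"}, "w": {"c", "d"}, "x": {"b"}}
ranking = device.combined_output(public_scores, payloads)
```

In this sketch the combination step runs entirely on the device, so the central system only ever sees the public table it broadcast, not which senders the device treated as private contacts.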
-
Publication number: US20220075897A1
Publication date: 2022-03-10
Application number: US17016788
Application date: 2020-09-10
Applicant: Google LLC
Inventor: Alessandro Epasto, Hossein Esfandiari, Vahab Seyed Mirrokni, Andres Munoz Medina, Umar Syed, Sergei Vassilvitskii
Abstract: A computer-implemented method for k-anonymizing a dataset to provide privacy guarantees for all columns in the dataset can include obtaining, by a computing system including one or more computing devices, a dataset comprising data indicative of a plurality of entities and at least one data item respective to at least one of the plurality of entities. The computer-implemented method can include clustering, by the computing system, the plurality of entities into at least one entity cluster. The computer-implemented method can include determining, by the computing system, a majority condition for the at least one entity cluster, the majority condition indicating that the at least one data item is respective to at least a majority of the plurality of entities. The computer-implemented method can include assigning, by the computing system, the at least one data item to the plurality of entities in an anonymized dataset based at least in part on the majority condition.
-
Publication number: US20200242268A1
Publication date: 2020-07-30
Application number: US16774380
Application date: 2020-01-28
Applicant: Google LLC
Inventor: Alessandro Epasto, Vahab Seyed Mirrokni, Hossein Esfandiari
Abstract: Example systems and methods enhance user privacy by performing efficient on-device public-private computation on a combination of public and private data, such as, for example, public and private graph data. In particular, the on-device public-private computation framework described herein can enable a device associated with an entity to efficiently compute a combined output that takes into account and is explicitly based upon a combination of data that is associated with the entity and data that is associated with one or more other entities that are private connections of the entity, all without revealing to a centralized computing system a set of locally stored private data that identifies the one or more other entities that are private connections of the entity.
-
Publication number: US20230359769A1
Publication date: 2023-11-09
Application number: US18345657
Application date: 2023-06-30
Applicant: Google LLC
Inventor: Alessandro Epasto, Hossein Esfandiari, Vahab Seyed Mirrokni, Andres Munoz Medina, Umar Syed, Sergei Vassilvitskii
CPC classification number: G06F21/6254, G06N20/00, G06F16/285
Abstract: A computer-implemented method for k-anonymizing a dataset to provide privacy guarantees for all columns in the dataset can include obtaining, by a computing system including one or more computing devices, a dataset comprising data indicative of a plurality of entities and at least one data item respective to at least one of the plurality of entities. The computer-implemented method can include clustering, by the computing system, the plurality of entities into at least one entity cluster. The computer-implemented method can include determining, by the computing system, a majority condition for the at least one entity cluster, the majority condition indicating that the at least one data item is respective to at least a majority of the plurality of entities. The computer-implemented method can include assigning, by the computing system, the at least one data item to the plurality of entities in an anonymized dataset based at least in part on the majority condition.
-
Publication number: US20190026640A1
Publication date: 2019-01-24
Application number: US16042975
Application date: 2018-07-23
Applicant: Google LLC
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing large datasets using a computationally-efficient representation are disclosed. A request to apply a coverage algorithm to a large input dataset is received. The large dataset includes sets of elements. A computationally-efficient representation of the large dataset is generated by generating a reduced set of elements that contains fewer elements based on a defined probability. For each element in the reduced set, a determination is made regarding whether the element appears in more than a threshold number of sets. When the element appears in more than the threshold number, the element is removed from sets until the element appears in only the threshold number. The coverage algorithm is then applied to the computationally-efficient representation to identify a subset of the sets. The system provides data identifying the subset of the sets in response to the received request.