Smart de-identification using date jittering
摘要:
System and method to produce an anonymized cohort having less than a predetermined risk of re-identification. The method includes receiving a data query of requested traits for the anonymized cohort, querying a data source to find records that possess at least some of the traits, forming a dataset from at least some of the records, and grouping the dataset in time into a first boundary group, a second boundary group, and one or more non-boundary groups temporally between the first boundary group and second boundary group. For each non-boundary group, calculating maximum time limits the non-boundary group can be time-shifted without overlapping an adjacent group, calculating a group jitter amount, capping the group jitter amount by the maximum time limits and by respective predetermined jitter limits, and jittering said non-boundary group by the capped group jitter amount to produce an anonymized dataset. Return the anonymized dataset.
信息查询
0/0