- Patent title: Systems and methods for data deduplication by generating similarity metrics using sketch computation
- Application number: US16718714
- Filing date: 2019-12-18
- Publication number: US10938961B1
- Publication date: 2021-03-02
- Inventors: Santhosh Rahul Ponnala, Tarang Vaish
- Applicant: Ndata, Inc.
- Applicant address: Mountain View, CA, US
- Patentee: Ndata, Inc.
- Current patentee: Ndata, Inc.
- Current patentee address: Mountain View, CA, US
- Agent: Wilson Sonsini Goodrich & Rosati
- Main classification: H04L29/06
- IPC classification: H04L29/06; G06F16/174
Abstract:
A method for data reduction may comprise computing (i) a first sketch of a first segment and (ii) a second sketch of a second segment. The first sketch and the second sketch may each comprise a set of features that are representative of or unique to the corresponding first and second segments. The method may also comprise processing the first sketch and the second sketch to generate a similarity metric indicative of whether the second segment is similar to the first segment. The method may further comprise (1) performing a differencing operation on the second segment relative to the first segment when the similarity metric is greater than or equal to a similarity threshold, or (2) storing the first segment and the second segment in a database without performing the differencing operation when the similarity metric is less than the similarity threshold.
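The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the sketch, the similarity metric, or the differencing operation are computed, so everything concrete here is an assumption made for illustration: the sketch is taken to be the smallest hashes of fixed-length byte shingles, the metric is the overlap ratio of the two feature sets, and the differencing uses `difflib.SequenceMatcher` from the standard library. The names `compute_sketch`, `similarity`, `diff`, `apply_delta`, and `reduce_pair`, and the default parameters, are hypothetical and are not taken from the patent.

```python
import difflib
import hashlib


def compute_sketch(segment: bytes, shingle_len: int = 8, num_features: int = 16) -> frozenset:
    """Assumed sketch: the num_features smallest hashes of overlapping byte shingles."""
    count = max(len(segment) - shingle_len + 1, 1)
    hashes = {
        int.from_bytes(
            hashlib.blake2b(segment[i:i + shingle_len], digest_size=8).digest(), "big"
        )
        for i in range(count)
    }
    return frozenset(sorted(hashes)[:num_features])


def similarity(sketch_a: frozenset, sketch_b: frozenset) -> float:
    """Assumed similarity metric: shared features divided by all features."""
    union = sketch_a | sketch_b
    if not union:
        return 1.0
    return len(sketch_a & sketch_b) / len(union)


def diff(base: bytes, target: bytes) -> list:
    """Encode target as copy-from-base and insert-literal operations."""
    delta = []
    matcher = difflib.SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("insert", target[j1:j2]))
        # "delete" needs no operation: those base bytes are simply not copied.
    return delta


def apply_delta(base: bytes, delta: list) -> bytes:
    """Rebuild the target segment from the base segment and a delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def reduce_pair(first: bytes, second: bytes, threshold: float = 0.5) -> tuple:
    """Return how the second segment would be stored relative to the first."""
    metric = similarity(compute_sketch(first), compute_sketch(second))
    if metric >= threshold:
        # Similar enough: keep only the difference relative to the first segment.
        return ("delta", diff(first, second))
    # Not similar: store the second segment in full, with no differencing.
    return ("full", second)
```

A caller would store the first segment as-is, then store either the delta or the full second segment depending on the tag returned by `reduce_pair`; `apply_delta(first, delta)` recovers the second segment when a delta was stored. The threshold of 0.5 is arbitrary and would need tuning against real data.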