Method and system for calculating minwise hash signatures from weighted sets
Abstract:
A system and method for creating locality-sensitive hash signatures from weighted feature sets is disclosed. The disclosed methodology takes advantage of discretization mechanisms commonly used in computer systems to model the influence of the feature weights on the calculated hash signature. The pseudorandom numbers required for the signature calculation are created in ascending order, which enables the signature generation mechanism to identify and avoid the unnecessary creation of pseudorandom numbers and thereby improves the performance of the signature calculation process. Further, hierarchical, tree-search-like algorithms are used during the processing of signature weights to further decrease the number of required random numbers. The features of the Poisson process model, such as its ability to provide random numbers in ascending order and the splittability and mergeability of Poisson processes, are used to further improve the performance of the signature calculation process.
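The role of ascending-order generation can be illustrated with a minimal Python sketch. It is not the disclosed method: it omits the weight discretization and the hierarchical tree search, and only shows how points of a Poisson process, produced in ascending order, allow generation to stop early. The function name `weighted_minhash_signature`, the parameters `m` and `seed`, and the use of SHA-256 to seed a per-feature generator are all assumptions made for illustration.

```python
import hashlib
import math
import random


def weighted_minhash_signature(weights, m, seed=0):
    """Illustrative sketch only (hypothetical helper, not the patented method).

    weights: dict mapping a feature name (str) to a positive weight.
    m:       number of signature components.
    """
    q = [math.inf] * m      # smallest point seen so far per component
    sig = [None] * m        # feature that produced that smallest point
    q_max = math.inf        # largest of the current per-component minima

    for feature, w in weights.items():
        if w <= 0:
            continue
        # Seed a generator from the feature alone, so the same feature
        # yields the same random stream in every set (assumption: SHA-256).
        digest = hashlib.sha256(f"{seed}:{feature}".encode()).digest()
        rng = random.Random(int.from_bytes(digest[:8], "big"))

        # Points of a Poisson process with rate w, in ascending order:
        # successive gaps are exponentially distributed with mean 1/w.
        h = rng.expovariate(1.0) / w
        while h < q_max:
            # Splitting: each point is assigned to one component at random,
            # so every component sees a thinned process with rate w/m.
            k = rng.randrange(m)
            if h < q[k]:
                q[k] = h
                sig[k] = feature
                q_max = max(q)
            h += rng.expovariate(1.0) / w
        # Once h >= q_max, no later (larger) point can change any
        # component, so no further random numbers are generated.
    return sig
```

Under these assumptions, the similarity of two weighted sets would be estimated as the fraction of signature components on which their signatures agree, for example `sum(a == b for a, b in zip(sig1, sig2)) / m`.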