Publication No.: US20190303387A1
Publication Date: 2019-10-03
Application No.: US16370952
Filing Date: 2019-03-30
Applicant: Avast Software s.r.o.
Inventors: Martin Smarda, Pavel Srámek
Abstract: Systems and methods capable of initializing centroids in large datasets before commencement of clustering operations. The systems and methods can utilize a random sampling window to increase the speed of centroid initialization. The systems and methods can be modified to leverage parallelism and be configured for execution on multi-node compute clusters. Optionally, the initialization systems and methods can include post-initialization centroid discarding and/or re-assignment operations that adaptively control cluster sizes.
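Illustrative sketch (not taken from the patent text): the abstract does not spell out the procedure, so the Python below is only one plausible reading of "random sampling window". It assumes a k-means++-style seeding in which each new centroid is drawn from a small random subset of the data rather than from the whole dataset, so that the distance computations per step scale with the window size instead of the dataset size. The names `init_centroids_windowed`, `window_size`, and `seed` are hypothetical, and the distance-weighted selection rule is an assumption, not the claimed method.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids_windowed(points, k, window_size, seed=0):
    """Pick k initial centroids, scoring only a random window per step.

    Assumption: this is a k-means++-style rule applied to a random
    subset ("window") of the data; the patent may define it differently.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    window_size = max(1, min(window_size, len(points)))

    # First centroid: uniform random pick from the full dataset.
    centroids = [points[rng.randrange(len(points))]]

    while len(centroids) < k:
        # Draw a fresh window instead of scanning every point.
        window = rng.sample(points, window_size)
        # Weight each windowed point by its squared distance to the
        # nearest centroid chosen so far.
        weights = [
            min(squared_distance(p, c) for c in centroids) for p in window
        ]
        if sum(weights) == 0:
            # All windowed points coincide with existing centroids;
            # fall back to a uniform pick from the window.
            centroids.append(rng.choice(window))
        else:
            centroids.append(rng.choices(window, weights=weights, k=1)[0])
    return centroids
```

Under these assumptions, a call such as `init_centroids_windowed(points, k=8, window_size=1000)` evaluates at most 1000 points per centroid regardless of dataset size. The parallel execution and the post-initialization discarding or re-assignment of centroids mentioned in the abstract are not covered by this sketch.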