Publication number: US20170193074A1
Publication date: 2017-07-06
Application number: US14985302
Filing date: 2015-12-30
Applicant: Yahoo! Inc.
Inventor: Sainath Vellal, Kostas Tsioutsiouliklis
IPC: G06F17/30
CPC classification number: G06F16/285, G06F16/2255, G06F16/24568, G06F16/951, G06F16/9566
Abstract: Software generates an article signature for each article in a plurality of articles. The software initializes a clustering algorithm with a plurality of initial clusters that are non-overlapping. A centroid signature is generated for each initial cluster from the article signatures of the articles in the initial cluster. The software performs a succession of alternating merges and splits using the centroid signatures to create a plurality of non-overlapping coherent clusters from the plurality of initial clusters. The software identifies an article that is related to a specific article by mapping the article signature for the specific article to the centroid signature for at least one coherent cluster and comparing that article signature to the article signatures of the articles in the coherent cluster, using at least one similarity measure. The software displays the specific article and the related article in proximity to each other in a content stream.
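The flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how signatures are built, which similarity measure is used, or what triggers a merge or a split, so everything concrete here is an assumption made for illustration: signatures are unit-length term-frequency vectors, similarity is cosine, the initial clusters are singletons, and the function names and thresholds (`article_signature`, `merge_pass`, `split_pass`, `merge_t`, `split_t`) are hypothetical.

```python
import math
from collections import Counter

def _normalise(weights):
    # Scale a term->weight mapping to unit length (empty stays empty).
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

def article_signature(text):
    # ASSUMPTION: a signature is a unit-length term-frequency vector.
    return _normalise(Counter(text.lower().split()))

def centroid_signature(signatures):
    # Centroid = normalised sum of the member article signatures.
    total = Counter()
    for sig in signatures:
        for term, weight in sig.items():
            total[term] += weight
    return _normalise(total)

def cosine(a, b):
    # Dot product; equals cosine similarity because inputs are unit length.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def merge_pass(clusters, sigs, merge_t):
    # Merge cluster pairs whose centroids are at least merge_t similar.
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        cents = [centroid_signature([sigs[i] for i in c]) for c in clusters]
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if cosine(cents[i], cents[j]) >= merge_t:
                    clusters[i] += clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

def split_pass(clusters, sigs, split_t):
    # Eject members that sit too far from their centroid into singletons.
    result = []
    for c in clusters:
        cent = centroid_signature([sigs[i] for i in c])
        keep = [i for i in c if cosine(sigs[i], cent) >= split_t]
        if keep:
            result.append(keep)
        result.extend([i] for i in c if i not in keep)
    return result

def cluster_articles(sigs, merge_t=0.5, split_t=0.5, rounds=3):
    # ASSUMPTION: start from singleton clusters, then alternate merge/split.
    clusters = [[i] for i in range(len(sigs))]
    for _ in range(rounds):
        clusters = split_pass(merge_pass(clusters, sigs, merge_t), sigs, split_t)
    return clusters

def related_articles(idx, clusters, sigs, top_k=3):
    # Map the article to its nearest centroid, then rank that cluster's
    # other members by similarity to the article's own signature.
    cents = [centroid_signature([sigs[i] for i in c]) for c in clusters]
    best = max(range(len(clusters)), key=lambda n: cosine(sigs[idx], cents[n]))
    others = [i for i in clusters[best] if i != idx]
    others.sort(key=lambda i: cosine(sigs[idx], sigs[i]), reverse=True)
    return others[:top_k]

articles = [
    "yahoo stock price rises after earnings report",
    "yahoo stock price falls after earnings report",
    "local team wins football championship final",
    "football team wins championship final again",
]
sigs = [article_signature(a) for a in articles]
clusters = cluster_articles(sigs)
print(clusters, related_articles(0, clusters, sigs))
```

Because every article belongs to exactly one list at every step, the clusters stay non-overlapping throughout. A production system would likely use compact hashed signatures and an index over centroids rather than the exhaustive comparisons shown here.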