-
Publication No.: US08949237B2
Publication date: 2015-02-03
Application No.: US13345593
Filing date: 2012-01-06
Applicant: Maria Florina Balcan, Christian H. Borgs, Mark Braverman, Jennifer T. Chayes, Shanghua Teng
Inventor: Maria Florina Balcan, Christian H. Borgs, Mark Braverman, Jennifer T. Chayes, Shanghua Teng
CPC classification: G06F17/30867, G06Q10/10, G06Q50/01
Abstract: A technique for identifying overlapping clusters of items in a data set. The technique may be used in connection with a social network or other on-line environment in which users express approval for other users, such as through votes, tags or other inputs. These expressions of approval may be used to form clusters such that entities assigned to a cluster have a higher metric of approval from other entities within the cluster than from outside the cluster. Such clusters may be arrived at through a computationally efficient approach that involves randomly selecting one or more entities as a seed for a cluster. The cluster may be grown by testing other entities, similar to those already in the cluster, to determine whether they are more preferred by those already in the cluster than by those outside the cluster. Once a cluster is grown to a desired size, it may be pruned.
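The seed, grow, and prune steps named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names `approval_rate` and `grow_and_prune`, the `approves` mapping (entity to the set of entities it approves of), and the choice of "fraction of voters who approve" as the approval metric. The abstract's restriction of candidates to entities "similar to those already in the cluster" is not modeled; every non-member is tested.

```python
import random


def approval_rate(approves, voters, target):
    """Fraction of `voters` (other than `target`) that approve of `target`."""
    others = [v for v in voters if v != target]
    if not others:
        return 0.0
    hits = sum(1 for v in others if target in approves.get(v, set()))
    return hits / len(others)


def grow_and_prune(approves, entities, max_size, rng):
    """Hypothetical sketch: seed one cluster at random, grow it, then prune it."""
    # Seed: pick one entity at random (sorted for reproducibility with a fixed rng).
    cluster = {rng.choice(sorted(entities))}

    # Grow: admit a candidate when members approve of it at a higher rate
    # than non-members do. Repeat until nothing changes or the size cap is hit.
    changed = True
    while changed and len(cluster) < max_size:
        changed = False
        for cand in sorted(entities - cluster):
            inside = approval_rate(approves, cluster, cand)
            outside = approval_rate(approves, entities - cluster, cand)
            if inside > outside:
                cluster.add(cand)
                changed = True
                if len(cluster) >= max_size:
                    break

    # Prune: drop members that are not more approved inside than outside.
    for member in sorted(cluster):
        if len(cluster) == 1:
            break
        inside = approval_rate(approves, cluster, member)
        outside = approval_rate(approves, entities - cluster, member)
        if inside <= outside:
            cluster.discard(member)
    return cluster


if __name__ == "__main__":
    # Two groups whose members approve only of each other (made-up data).
    approves = {
        "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"},
        "x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"},
    }
    entities = set(approves)
    print(sorted(grow_and_prune(approves, entities, 4, random.Random(0))))
```

Calling `grow_and_prune` repeatedly with different random seeds yields one cluster per call; because each call starts from the full entity set, the resulting clusters are free to overlap, which is the property the abstract emphasizes.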
-
Publication No.: US20130179449A1
Publication date: 2013-07-11
Application No.: US13345593
Filing date: 2012-01-06
Applicant: Maria Florina Balcan, Christian H. Borgs, Mark Braverman, Jennifer T. Chayes, Shanghua Teng
Inventor: Maria Florina Balcan, Christian H. Borgs, Mark Braverman, Jennifer T. Chayes, Shanghua Teng
IPC classification: G06F17/30
CPC classification: G06F17/30867, G06Q10/10, G06Q50/01
Abstract: A technique for identifying overlapping clusters of items in a data set. The technique may be used in connection with a social network or other on-line environment in which users express approval for other users, such as through votes, tags or other inputs. These expressions of approval may be used to form clusters such that entities assigned to a cluster have a higher metric of approval from other entities within the cluster than from outside the cluster. Such clusters may be arrived at through a computationally efficient approach that involves randomly selecting one or more entities as a seed for a cluster. The cluster may be grown by testing other entities, similar to those already in the cluster, to determine whether they are more preferred by those already in the cluster than by those outside the cluster. Once a cluster is grown to a desired size, it may be pruned.
-