Method and system of clustering for multi-dimensional data streams
    1.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08762393B2

    Publication date: 2014-06-24

    Application No.: US12564343

    Filing date: 2009-09-22

    Applicant: Wong Suk Lee

    Inventor: Wong Suk Lee

    IPC classification: G06F7/00 G06F17/00 G06F17/30

    Abstract: A method for clustering multi-dimensional data streams includes: (a) when data elements are input, determining 1-D subclusters and assigning identifiers to the determined 1-D subclusters; (b) generating a matching set, that is, the set of identifiers of the 1-D subclusters whose ranges contain the data element's value in the corresponding dimension; and (c) determining subclusters by finding sets of frequently co-occurring 1-D subclusters among the 1-D subclusters that belong to the generated matching set. According to the abstract, the invention reduces the processing time required to find the subclusters and also improves memory efficiency.

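Steps (a) to (c) of the abstract can be illustrated with a minimal sketch. It is not the patented procedure: the abstract does not say how 1-D subclusters are determined, so this sketch assumes fixed-width grid cells per dimension, treats a cell as a 1-D subcluster once it holds at least `min_support` elements, and works on a finite list rather than an unbounded stream. The names `cell_id`, `find_subclusters`, `width`, and `min_support` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cell_id(dim, value, width):
    """Identifier of the 1-D range (dimension, cell index) containing value."""
    return (dim, int(value // width))


def find_subclusters(elements, min_support, width=1.0):
    """Return sets of 1-D subcluster IDs that co-occur at least min_support times.

    Assumption: elements is a list of equal-length numeric tuples.
    """
    # (a) Determine 1-D subclusters: grid cells that are dense enough.
    cell_counts = Counter(
        cell_id(dim, value, width)
        for element in elements
        for dim, value in enumerate(element)
    )
    one_d = {cell for cell, n in cell_counts.items() if n >= min_support}

    combo_counts = Counter()
    for element in elements:
        # (b) Matching set: IDs of the 1-D subclusters containing each value.
        match = sorted(
            cell_id(dim, value, width)
            for dim, value in enumerate(element)
            if cell_id(dim, value, width) in one_d
        )
        # (c) Count every co-occurring combination of two or more IDs.
        for size in range(2, len(match) + 1):
            combo_counts.update(combinations(match, size))

    return {combo for combo, n in combo_counts.items() if n >= min_support}
```

Enumerating every combination is exponential in the number of dimensions, so this form is only suitable for illustrating the idea on low-dimensional data.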

    Method and apparatus for finding maximal frequent itemsets over data streams
    2.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08150873B2

    Publication date: 2012-04-03

    Application No.: US12258645

    Filing date: 2008-10-27

    Applicant: Wong Suk Lee

    Inventor: Wong Suk Lee

    IPC classification: G06F7/00

    CPC classification: G06F17/30516

    Abstract: A method and apparatus for finding maximal frequent itemsets over data streams. A prefix tree manages itemsets and their appearance frequencies. Each node of the prefix tree holds an appearance frequency, a maximum lifetime, and a mark indicating whether the corresponding itemset is a maximal frequent itemset. The method includes: receiving a transaction Tk generated at the current point in time; updating the information held by each node of the prefix tree that corresponds to an itemset of the transaction Tk; adding to the prefix tree each node that corresponds to an itemset of the transaction Tk but is not yet managed in the tree, and setting the information on the added nodes; and finding maximal frequent itemsets by visiting each node of the prefix tree whose mark indicates a maximal frequent itemset and checking whether the corresponding itemset is frequent.

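The prefix-tree bookkeeping described in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented method: each node stores only an appearance count, the maximum lifetime and the per-node mark are omitted, every subset of a transaction is inserted (exponential in transaction size), and maximal itemsets are found by a brute-force superset check instead of by visiting marked nodes. The names `Node`, `PrefixTree`, `add_transaction`, and `min_count` are hypothetical.

```python
from itertools import combinations


class Node:
    """Prefix-tree node: the path from the root spells out one itemset."""

    def __init__(self):
        self.count = 0      # appearance frequency of the itemset
        self.children = {}  # next item -> child node


class PrefixTree:
    def __init__(self):
        self.root = Node()

    def add_transaction(self, items):
        """Update existing nodes and add missing ones for every subset."""
        ordered = sorted(set(items))
        for size in range(1, len(ordered) + 1):
            for itemset in combinations(ordered, size):
                node = self.root
                for item in itemset:
                    node = node.children.setdefault(item, Node())
                node.count += 1

    def frequent_itemsets(self, min_count):
        """Collect itemsets whose count reaches min_count."""
        found = []
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for item, child in node.children.items():
                # A superset is never more frequent than its subset,
                # so infrequent branches can be skipped entirely.
                if child.count >= min_count:
                    itemset = prefix + (item,)
                    found.append(frozenset(itemset))
                    stack.append((itemset, child))
        return found

    def maximal_frequent_itemsets(self, min_count):
        """Frequent itemsets that have no frequent proper superset."""
        frequent = self.frequent_itemsets(min_count)
        return {s for s in frequent if not any(s < other for other in frequent)}
```

The superset check compares against all frequent itemsets rather than only descendants, because in a prefix tree a superset of an itemset can sit in a different branch (for example, {a, b} is stored under a, not under b).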