Invention application
- Patent title: System and method for continuous diagnosis of data streams
- Patent title (Chinese): 用于连续诊断数据流的系统和方法
- Application number: US10880913
- Filing date: 2004-06-30
- Publication number: US20060010093A1
- Publication date: 2006-01-12
- Inventors: Wei Fan, Haixun Wang, Philip Yu
- Applicants: Wei Fan, Haixun Wang, Philip Yu
- Applicant address: US NY Armonk
- Assignee: IBM Corporation
- Current assignee: IBM Corporation
- Current assignee address: US NY Armonk
- Main classification: G06F17/30
- IPC classification: G06F17/30
Abstract:
In connection with the mining of time-evolving data streams, a general framework is provided that mines changes and reconstructs models from a data stream with unlabeled instances or a limited number of labeled instances. In particular, statistical profiling methods are defined herein that extend a classification tree in order to guess the percentage of drift in the data stream without any labeled data. The exact error can then be estimated by actively sampling a small number of true labels. If the estimated error is significantly higher than empirical expectations, a small number of true labels are preferably re-sampled to reconstruct the decision tree from the leaf node level.
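
The two checks described in the abstract can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the abstract does not specify the profiling statistic, so the sketch assumes one plausible choice, namely comparing how instances are distributed over the tree's leaves at training time and in the stream. The function names (`leaf_distribution`, `drift_guess`, `sampled_error`) and the `label_oracle` callback are hypothetical.

```python
from collections import Counter
import random


def leaf_distribution(leaf_ids):
    """Fraction of instances routed to each leaf (leaf_ids must be non-empty)."""
    counts = Counter(leaf_ids)
    total = len(leaf_ids)
    return {leaf: n / total for leaf, n in counts.items()}


def drift_guess(train_leaf_ids, stream_leaf_ids):
    """Label-free drift guess (assumed statistic, not taken from the source).

    Returns half the L1 distance between the training-time and stream-time
    leaf distributions: 0.0 for identical distributions, 1.0 for disjoint ones.
    """
    p = leaf_distribution(train_leaf_ids)
    q = leaf_distribution(stream_leaf_ids)
    leaves = set(p) | set(q)
    return sum(abs(p.get(leaf, 0.0) - q.get(leaf, 0.0)) for leaf in leaves) / 2


def sampled_error(predictions, label_oracle, sample_size, seed=0):
    """Estimate the error rate by requesting true labels for a small sample.

    label_oracle(i) is a hypothetical callback that returns the true label
    of stream instance i (for example, by asking a human expert).
    """
    rng = random.Random(seed)
    indices = rng.sample(range(len(predictions)), sample_size)
    wrong = sum(predictions[i] != label_oracle(i) for i in indices)
    return wrong / sample_size


# Leaf ids seen during training versus leaf ids seen in the stream.
print(drift_guess([0, 0, 1, 1], [0, 0, 0, 1]))  # 0.25
```

Under these assumptions, a large `drift_guess` value would prompt a call to `sampled_error`, and only a sampled error well above the expected level would justify spending more labels on rebuilding the leaves.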
Published/granted documents
- US07464068B2 System and method for continuous diagnosis of data streams (publication/grant date: 2008-12-09)