Title: Grid-based clustering over an evolving data stream

Authors: Renxia Wan, Jingchao Chen, Lixin Wang, Xiaoke Su

Addresses: College of Information Science and Technology, Donghua University, Shanghai 201620, China (all four authors)

Abstract: Clustering a data stream is challenging because it must be carried out within limited space and under strict time constraints, while the stream itself may be potentially infinite. Many clustering algorithms for data streams have been proposed, and they have greatly advanced data stream clustering, but most of them are designed for convex clusters. This paper presents a grid-based clustering algorithm that first maps each data point into its corresponding grid and then iteratively merges the grids into clusters; only boundary grids are considered during the merging stage. The algorithm can also group an evolving data stream into arbitrarily shaped clusters. Compared with algorithms of the same category, it requires fewer input parameters. Theoretical and experimental analysis indicates that the proposed algorithm outperforms algorithms of the same category in both effectiveness and efficiency.
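The two stages named in the abstract (map each point to a grid, then merge neighbouring grids into clusters) can be illustrated with a minimal Python sketch. This is a generic, static grid-clustering example and not the authors' algorithm: it does not implement the boundary-grid restriction, the acceptable distance, or any handling of stream evolution. The function name `grid_cluster`, the parameters `cell_size` and `min_points`, and the rule that cells sharing a side or corner count as neighbours are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def grid_cluster(points, cell_size=1.0, min_points=1):
    """Label points by the connected group of dense grid cells they fall in.

    Hypothetical helper: cell_size and min_points are illustrative
    parameters, not taken from the paper.
    """
    # Stage 1: map every point to the integer coordinates of its grid cell.
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(x // cell_size) for x in p)
        cells[key].append(p)

    # Assumed density rule: a cell takes part in merging only if it
    # holds at least min_points points.
    dense = {k for k, pts in cells.items() if len(pts) >= min_points}

    # Stage 2: merge neighbouring dense cells with a breadth-first search.
    # Cells that share a side or a corner are treated as neighbours.
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for offset in product((-1, 0, 1), repeat=len(cell)):
                nb = tuple(c + o for c, o in zip(cell, offset))
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Each point inherits the label of its cell; points in sparse cells get -1.
    return [
        labels.get(tuple(int(x // cell_size) for x in p), -1) for p in points
    ]


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.9, 0.2), (1.5, 0.5), (5.0, 5.0), (5.5, 5.5)]
    print(grid_cluster(pts, cell_size=1.0, min_points=1))
```

Because clusters are built from chains of adjacent cells rather than from distances to a centre, a sketch of this kind can follow non-convex shapes, which is the property the abstract contrasts with algorithms designed for convex clusters.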

Keywords: clustering; data stream; grid clique; neighbouring grid; boundary grids; merging; acceptable distance; grid characteristic information; grid computing.

DOI: 10.1504/IJDMMM.2009.029033

International Journal of Data Mining, Modelling and Management, 2009 Vol.1 No.4, pp.393 - 410

Published online: 29 Oct 2009
