Authors: Wei Fan, Toyohide Watanabe, Koichi Asakura
Addresses: Department of Systems and Social Informatics, Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan; Department of Systems and Social Informatics, Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan; School of Informatics, Daido Institute of Technology, 10-3, Takiharu-cho, Minami-ku, Nagoya 457-8530, Japan
Abstract: In this paper, we propose a framework that supports clustering over different portions of continuous data streams at all possible time points. The framework is divided into two phases. The online statistics maintenance phase provides an approximation method for online statistics collection and a compact multi-resolution hierarchy for statistics maintenance. Once a clustering request is submitted, the offline clustering phase abstracts, from the statistics hierarchies, statistics that approximate the user-desired subsequences as precisely as possible, and outputs the results of clustering over these statistics. Our performance experiments over real and synthetic data sets illustrate the effectiveness and efficiency of our approach.
Keywords: data stream mining; flexible clustering; multiple data streams; one data scan; summarisation statistics hierarchy; adaptive abstraction; subsequences.
International Journal of Advanced Intelligence Paradigms, 2008 Vol.1 No.2, pp.178 - 195
Published online: 30 Apr 2009