Title: Variance and density-based anomaly identification and ranking for evolving data streams

Authors: Yogita; Durga Toshniwal

Addresses: Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee – 247667, Uttarakhand, India; Department of Computer Science and Engineering, Indian Institute of Technology, Roorkee – 247667, Uttarakhand, India

Abstract: Data stream mining has emerged as an important research area in recent years because many applications generate streaming data. Stream mining is challenging because streaming data cannot be scanned multiple times and new concepts may keep evolving in the data. Finding anomalies in data streams is of great significance in many applications. Most existing anomaly detection techniques suffer from several limitations. They are commonly applicable to static data of uniform density, whereas the majority of real-world data is of varying density. In the present work, we propose a method that identifies anomalies based on variance and data density and also assigns ranks to the anomalies. Initially, variance-based clustering is performed to find candidate anomalies. These candidates are then ranked based on the densities of the clusters. Furthermore, to reduce the effect of irrelevant (noisy) attributes during anomaly detection, the proposed method assigns weights to attributes according to their relevance. In view of the challenges of streaming data, the proposed method is incremental and adapts to the evolution of new concepts in the data. Experimental results on both synthetic and real-world datasets show that the proposed method outperforms other existing methods.

Keywords: anomaly detection; evolving data streams; varying density datasets; variance-based clustering; data stream mining; anomaly ranking; data density; streaming data.

DOI: 10.1504/IJCISTUDIES.2014.062734

International Journal of Computational Intelligence Studies, 2014 Vol.3 No.2/3, pp.251 - 274

Received: 24 Jan 2013
Accepted: 06 Aug 2013

Published online: 28 Jun 2014
