Title: Outlier detection over data streams: survey

Authors: Zaki Brahmi; Imen Souiden

Addresses: Computer Science Department, College of Science and Arts at Al-Ola, University of Taibah, Medina, KSA; Higher Institute of Computer Science and Management of Kairouan, Kairouan University, Tunisia

Abstract: Outlier detection is regarded as one of the most important applications of data mining and is widely applied in fields such as healthcare and telecommunications. It contributes to improved data analysis, the avoidance of bad results, and the prevention of possible threats. In many scenarios, the data to be processed take the form of streams, which differ from static data in characteristics such as uncertainty, multidimensionality, dynamic distribution, transiency, and dynamic relationships. This makes outlier detection a more challenging problem: traditional data mining techniques cannot be used, and techniques suited to the nature of data streams must be applied instead. This paper discusses the key issues, major challenges, and the most frequently used existing methods for detecting outliers in the context of data stream mining. The studied approaches are compared theoretically and experimentally on the basis of a set of criteria. In the experimental study, we compare two of the most widely used algorithms for data streams (AnyOut and MCOD) using a framework that predicts abnormal cloud server behaviour by detecting abnormal CPU and memory requests from cloud users. The results reveal that MCOD outperforms AnyOut across the different parameter settings.

Keywords: outlier detection; data stream mining; cloud computing.

DOI: 10.1504/IJBIDM.2021.118949

International Journal of Business Intelligence and Data Mining, 2021 Vol.19 No.4, pp.481 - 507

Received: 31 Oct 2019
Accepted: 01 Dec 2019

Published online: 12 Nov 2021
