Title: Hybridisation of classifiers for anomaly detection in big data

Authors: Rasim M. Alguliyev; Ramiz M. Aliguliyev; Fargana Jabbar Abdullayeva

Addresses: Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141, Baku, Azerbaijan ' Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141, Baku, Azerbaijan ' Institute of Information Technology, Azerbaijan National Academy of Sciences, 9A, B. Vahabzade Street, AZ1141, Baku, Azerbaijan

Abstract: The widespread use of cloud technologies has led to a rapid increase in the scale and complexity of cloud infrastructure. Performance degradation and downtime in these large-scale systems are a major problem. A key step in addressing them is to detect anomalies that can occur in the hardware, software and system state of the cloud infrastructure. This paper proposes a semi-supervised classification method, based on an ensemble of classifiers, for detecting anomalies in the performance metrics of cloud infrastructure. The ensemble is built from the Naive Bayes, J48, SMO, multilayer perceptron, IBK and PART algorithms. Anomalous behaviour in the performance metrics is detected using public data from Google and Yahoo!, together with Python 2.7, MATLAB, Weka and the Google Cloud SDK Shell. In the experimental study, the model achieves 90% detection accuracy.
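
To make the idea of a classifier ensemble concrete, the following is a minimal sketch of one common way to combine base-classifier outputs: simple majority voting. The abstract does not state which combination rule or semi-supervised procedure the authors use, so the voting rule, the tie-breaking choice, the function names and the example labels below are all assumptions made for illustration. They are not the paper's method or its results.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(predictions: Dict[str, List[int]]) -> List[int]:
    """Combine per-classifier labels (1 = anomaly, 0 = normal) by majority vote.

    `predictions` maps a classifier name to its labels for the same samples.
    """
    columns = list(predictions.values())
    if not columns:
        raise ValueError("need at least one classifier")
    n_samples = len(columns[0])
    if any(len(col) != n_samples for col in columns):
        raise ValueError("all classifiers must label the same samples")

    combined = []
    for votes in zip(*columns):  # one tuple of votes per sample
        counts = Counter(votes)
        # Assumption: a tie (possible with an even number of classifiers)
        # is flagged as an anomaly rather than silently ignored.
        combined.append(1 if counts[1] >= counts[0] else 0)
    return combined


def accuracy(predicted: List[int], actual: List[int]) -> float:
    """Fraction of samples whose predicted label matches the actual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Invented labels for five samples; the names mirror the algorithms listed
# in the abstract, but these values are NOT taken from the paper.
example_predictions = {
    "NaiveBayes": [0, 1, 0, 1, 0],
    "J48":        [0, 1, 0, 0, 0],
    "SMO":        [0, 1, 1, 1, 0],
    "MLP":        [0, 0, 0, 1, 0],
    "IBk":        [1, 1, 0, 1, 0],
    "PART":       [0, 1, 1, 0, 1],
}
example_truth = [0, 1, 1, 1, 0]

ensemble_labels = majority_vote(example_predictions)
ensemble_accuracy = accuracy(ensemble_labels, example_truth)
```

In this sketch each base classifier is treated as a black box that has already labelled the samples, so the same combining step would apply whether the labels come from Weka, Python or another tool. Weighted voting or stacking would be equally plausible alternatives.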

Keywords: anomaly; performance metrics; big data; CPU-usage; memory usage; naive Bayes; J48 decision tree; semi-supervised algorithms; classifiers ensemble; Google cluster trace.

DOI: 10.1504/IJBDI.2019.097396

International Journal of Big Data Intelligence, 2019 Vol.6 No.1, pp.11 - 19

Received: 08 Aug 2017
Accepted: 04 Dec 2017

Published online: 21 Jan 2019
