Title: CICIDS2017 dataset: performance improvements and validation as a robust intrusion detection system testbed

Authors: Akram Boukhamla; Javier Coronel Gaviro

Addresses: Faculty of New Technologies of Information and Communication, Kasdi Merbah University, Ouargla BP.511, 30000, Algeria; Signal Processing Applications Group of Signals, Systems and Radiocommunications, Department ETSI de Telecomunicación, Universidad Politécnica de Madrid, Avda. Complutense, 30. 28040 Madrid, Spain

Abstract: Network security is a major challenge in the fight against new, sophisticated attacks. Many intrusion detection systems (IDS) have been developed and improved to prevent unauthorised access by malicious intruders. Developing and evaluating accurate IDS requires varied datasets that capture the most relevant features and real data from up-to-date types of attacks on real hardware and software scenarios. This paper describes and optimises a recently released dataset called CICIDS2017 (CICIDS2017, 2017). Using principal component analysis (PCA) to optimise the CICIDS2017 dataset, the dimensionality of the features and the number of records have been reduced without losing specificity or sensitivity, thus reducing the overall size and leading to faster IDS. Finally, the optimised CICIDS2017 dataset is evaluated using three well-known classifiers (KNN, C4.5 and naïve Bayes). The results show that the optimised dataset maintains the same specificity and sensitivity as the non-optimised version.
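
The two evaluation criteria named in the abstract can be illustrated with a minimal sketch. Sensitivity is the fraction of attack records correctly flagged, and specificity is the fraction of benign records correctly passed. The function name `sensitivity_specificity`, the label strings and the toy labels below are assumptions made for illustration only; they are not taken from the paper or from the CICIDS2017 dataset.

```python
def sensitivity_specificity(y_true, y_pred, positive="attack"):
    """Return (sensitivity, specificity) for a binary labelling.

    Hypothetical helper for illustration; `positive` names the attack class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # attack correctly detected
            else:
                fn += 1  # attack missed
        else:
            if pred == positive:
                fp += 1  # benign traffic wrongly flagged
            else:
                tn += 1  # benign traffic correctly passed
    # Guard against a class being absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy labels (invented for this sketch, not results from the paper).
y_true = ["attack", "attack", "benign", "benign", "benign", "attack"]
y_pred = ["attack", "benign", "benign", "attack", "benign", "attack"]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Comparing these two values for a classifier trained on the full dataset and on the reduced one is one way to check the abstract's claim that the reduction does not degrade detection quality.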

Keywords: intrusion detection system; IDS; network security; network attacks; CICIDS2017; principal component analysis; PCA; machine learning.

DOI: 10.1504/IJICS.2021.117392

International Journal of Information and Computer Security, 2021 Vol.16 No.1/2, pp.20 - 32

Received: 23 Feb 2018
Accepted: 06 Sep 2018

Published online: 27 Aug 2021