Title: NITIDS: a robust network intrusion dataset

Authors: Santosh Kumar Sahu; Durga Prasad Mohapatra; Sanjaya Kumar Panda

Addresses: Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Odisha, India; Department of Computer Science and Engineering, National Institute of Technology, Rourkela, Odisha, India; Department of Computer Science and Engineering, National Institute of Technology, Warangal, Telangana, India

Abstract: Predictive analytics draws on techniques from many disciplines to analyse known data and make predictions about unknown data. Regardless of the domain, this requires a large amount of processed data. In intrusion detection modelling in particular, most researchers rely on outdated attack signatures. Models built on such data are never trained on the signatures of recent attacks and are therefore of little use in detecting modern, sophisticated attacks. Moreover, achieving high accuracy with low training and generalisation error requires inferring the statistical properties of the dataset and selecting its samples carefully. In this paper, we create the National Institute of Technology Intrusion Detection System (NITIDS) dataset, an open intrusion dataset that contains the signatures of recent attacks. The dataset contains 60 recent signatures of denial of service, probing, user to root and remote to local attacks. Its value lies in its recent signatures, real-time traffic and open availability to the research community. In the simulation, we process the acquired data with a proposed data preprocessing technique that handles missing value imputation, redundant samples, imbalanced data distribution and outlier removal.
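
The four preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' technique, which the abstract does not specify. Everything here is an assumption made for illustration: the function names, median imputation, exact-match duplicate removal, the 1.5 × IQR outlier rule and random oversampling. The steps are also reordered so that balancing runs last, because oversampling would otherwise reintroduce the duplicates removed earlier.

```python
import random
import statistics

# Each row is (features, label); a missing feature value is None.
# All names and method choices below are illustrative assumptions,
# not the preprocessing technique proposed in the paper.

def impute_missing(rows):
    """Replace None in each feature column with that column's median."""
    n_features = len(rows[0][0])
    medians = []
    for j in range(n_features):
        observed = [feats[j] for feats, _ in rows if feats[j] is not None]
        medians.append(statistics.median(observed))
    return [([medians[j] if v is None else v for j, v in enumerate(feats)], label)
            for feats, label in rows]

def drop_duplicates(rows):
    """Keep only the first occurrence of each (features, label) pair."""
    seen, unique = set(), []
    for feats, label in rows:
        key = (tuple(feats), label)
        if key not in seen:
            seen.add(key)
            unique.append((feats, label))
    return unique

def remove_outliers(rows, k=1.5):
    """Drop rows with any feature outside [Q1 - k*IQR, Q3 + k*IQR]."""
    n_features = len(rows[0][0])
    bounds = []
    for j in range(n_features):
        q1, _, q3 = statistics.quantiles([feats[j] for feats, _ in rows], n=4)
        iqr = q3 - q1
        bounds.append((q1 - k * iqr, q3 + k * iqr))
    return [(feats, label) for feats, label in rows
            if all(lo <= v <= hi for v, (lo, hi) in zip(feats, bounds))]

def oversample(rows, seed=0):
    """Randomly repeat minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_label = {}
    for feats, label in rows:
        by_label.setdefault(label, []).append((feats, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

def preprocess(rows):
    """Impute, de-duplicate, remove outliers, then balance the classes."""
    return oversample(remove_outliers(drop_duplicates(impute_missing(rows))))
```

A global IQR filter such as this one can discard rare attack records along with genuine noise, so in practice the bounds might be computed per class or replaced by a method suited to intrusion data.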

Keywords: intrusion dataset; data preprocessing; imbalanced dataset; KDDCup99; GureKDD; NSL-KDD.

DOI: 10.1504/IJES.2021.117951

International Journal of Embedded Systems, 2021 Vol.14 No.4, pp.391 - 408

Received: 08 Apr 2020
Accepted: 13 Aug 2020

Published online: 05 Oct 2021
