Title: Indirect classification approaches: a comparative study in network intrusion detection

Authors: Taghi M. Khoshgoftaar, Kehan Gao, Hua Lin

Addresses: Department of Computer Science and Engineering, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA; Department of Mathematics and Computer Science, Eastern Connecticut State University, 83 Windham Street, Willimantic, CT 06226, USA; Department of Computer Science and Engineering, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA

Abstract: The application of data mining and machine learning techniques to network intrusion detection has recently gained importance. This paper presents a set of indirect classification techniques for addressing the multi-category classification problem in network intrusion detection. Direct classification techniques generally extend the associated binary classifiers to handle multi-category problems. An indirect classification technique, in contrast, decomposes the original multi-category problem into multiple binary classification problems based on some criteria. We investigate the one vs. one and one vs. rest approaches for building the binary classifiers, the results of which are then merged using a combining strategy. We investigate three combining strategies: Hamming decoding, loss-based decoding, and the soft-max function. Pairing the two binarisation approaches with the three combining strategies yields the six indirect classification techniques evaluated in our study. To our knowledge, no existing work evaluates as many indirect classification techniques. The six approaches are compared with one another in the context of the DARPA KDD–1999 offline intrusion detection project. Our empirical evaluation indicates that, among the binarisation techniques, the one vs. one technique generally yielded better results, while among the combining strategies, loss-based decoding and Hamming decoding yielded better results than the soft-max function. This study demonstrates the usefulness of the indirect classification approach for network intrusion detection.
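
The decomposition and combining steps described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names, the code-matrix representation (+1, -1, 0), the exponential loss, and the example scores are all assumptions made for illustration, following the common error-correcting-output-code formulation of one vs. rest and one vs. one. The soft-max combining strategy is left out because the abstract does not give its form.

```python
from itertools import combinations
import math

def one_vs_rest_matrix(k):
    """Code matrix for one vs. rest: k binary problems.
    Row c is the codeword of class c (+1 for its own problem, -1 elsewhere)."""
    return [[1 if c == j else -1 for j in range(k)] for c in range(k)]

def one_vs_one_matrix(k):
    """Code matrix for one vs. one: one binary problem per class pair (a, b).
    Entry is +1 for class a, -1 for class b, 0 for classes not involved."""
    pairs = list(combinations(range(k), 2))
    return [[1 if c == a else -1 if c == b else 0 for (a, b) in pairs]
            for c in range(k)]

def _sign(x):
    return 1 if x > 0 else -1 if x < 0 else 0

def hamming_decode(matrix, scores):
    """Return the class whose codeword disagrees least with the signs of the
    binary classifier outputs. Ties go to the lowest class index."""
    def distance(row):
        return sum((1 - m * _sign(s)) / 2 for m, s in zip(row, scores))
    return min(range(len(matrix)), key=lambda c: distance(matrix[c]))

def loss_decode(matrix, scores, loss=lambda z: math.exp(-z)):
    """Return the class whose codeword gives the smallest total loss on the
    real-valued outputs. The exponential loss is an assumed default."""
    def total(row):
        return sum(loss(m * s) for m, s in zip(row, scores))
    return min(range(len(matrix)), key=lambda c: total(matrix[c]))

# Hypothetical real-valued outputs of three one vs. rest classifiers.
ovr = one_vs_rest_matrix(3)
ovr_scores = [0.2, 0.9, -0.5]
print(hamming_decode(ovr, ovr_scores), loss_decode(ovr, ovr_scores))

# Hypothetical outputs of the pairwise classifiers (0,1), (0,2), (1,2).
ovo = one_vs_one_matrix(3)
ovo_scores = [-0.7, 0.4, 0.8]
print(hamming_decode(ovo, ovo_scores), loss_decode(ovo, ovo_scores))
```

In the one vs. rest example two classifiers fire, so Hamming decoding sees a tie between classes 0 and 1 and falls back on the lowest index, whereas loss-based decoding uses the magnitudes of the outputs and picks class 1. This is one intuition for why the two decoders can disagree; it says nothing about the empirical results reported in the paper.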

Keywords: network intrusion detection; multi-group classifications; binary classifiers; indirect combining techniques; Hamming decoding; loss-based decoding; soft-max function; data mining; indirect classification.

DOI: 10.1504/IJCAT.2006.011995

International Journal of Computer Applications in Technology, 2006 Vol.27 No.4, pp.232 - 245

Published online: 08 Jan 2007
