Title: Semi-supervised and compound classification of network traffic

Authors: Jun Zhang; Chao Chen; Yang Xiang; Wanlei Zhou

Addresses: School of Information Technology, Deakin University, Melbourne, Australia; School of Information Technology, Deakin University, Melbourne, Australia; School of Information Technology, Deakin University, Melbourne, Australia; School of Information Technology, Deakin University, Melbourne, Australia

Abstract: This paper presents a new semi-supervised method that improves traffic classification performance when very little labelled training data is available. Existing semi-supervised methods label a large proportion of testing flows as unknown because of the limited supervised information, which severely degrades classification performance. To address this problem, we propose incorporating flow correlation into both the training and testing stages. At the training stage, we use flow correlation to extend the supervised data set by automatically labelling unlabelled flows according to their correlation with the pre-labelled flows. Consequently, the traffic classifier achieves excellent performance thanks to the enhanced training data set. At the testing stage, correlated flows are identified and classified jointly by combining their individual predictions, which further boosts classification accuracy. An empirical study on real-world network traffic shows that the proposed method significantly outperforms state-of-the-art classification methods based on flow statistical features.
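
The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how flow correlation is determined or how individual predictions are combined. The sketch assumes each flow carries a hypothetical "key" field, that flows with the same key count as correlated, and that predictions are combined by a simple majority vote. The function names extend_labels and compound_classify are likewise illustrative.

```python
from collections import Counter, defaultdict


def extend_labels(flows, labels):
    """Training stage (sketch): copy the class of each pre-labelled flow
    to unlabelled flows that share its correlation key.

    flows  -- list of dicts, each with a hypothetical "key" field
    labels -- dict mapping flow index -> class for pre-labelled flows
    """
    key_to_label = {}
    for index, label in labels.items():
        # Assumption: the first label seen for a key wins.
        key_to_label.setdefault(flows[index]["key"], label)

    extended = dict(labels)
    for index, flow in enumerate(flows):
        if index not in extended and flow["key"] in key_to_label:
            extended[index] = key_to_label[flow["key"]]
    return extended


def compound_classify(flows, predictions):
    """Testing stage (sketch): classify correlated flows jointly by
    replacing each flow's prediction with its group's majority vote.
    Majority voting is an assumed combination rule.
    """
    groups = defaultdict(list)
    for index, flow in enumerate(flows):
        groups[flow["key"]].append(index)

    combined = list(predictions)
    for members in groups.values():
        votes = Counter(predictions[i] for i in members)
        winner = votes.most_common(1)[0][0]
        for i in members:
            combined[i] = winner
    return combined


# Toy data: flows 0, 1 and 3 are correlated; flow 2 stands alone.
flows = [{"key": "a"}, {"key": "a"}, {"key": "b"}, {"key": "a"}]
extended = extend_labels(flows, {0: "http"})
combined = compound_classify(flows, ["http", "ssh", "dns", "http"])
```

In this toy run, the single pre-labelled flow supplies labels for its two correlated flows, and the one dissenting prediction in the correlated group is overridden by the group vote. Flows with no labelled or correlated partner are left unchanged.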

Keywords: traffic classification; semi-supervised classification; network traffic; compound classifiers; network security; flow correlation; classification accuracy.

DOI: 10.1504/IJSN.2012.053463

International Journal of Security and Networks, 2012 Vol.7 No.4, pp.252-261

Published online: 25 Apr 2013
