Title: Effective feature selection based on MANOVA

Authors: Trong-Kha Nguyen; Vu Duc Ly; Seong Oun Hwang

Addresses: Department of Electronics and Computer Engineering, Hongik University, Sejong, South Korea; Department of Electronics and Computer Engineering, Hongik University, Sejong, South Korea; Department of Software and Communications Engineering, Hongik University, Sejong, South Korea

Abstract: Effectiveness in malware classification is a critical issue: a lack of it can overload a classifier or reduce performance in real-time malware detection systems. However, effectiveness at the feature selection stage has not been studied so far. Because effectiveness should be taken into account at the earliest possible stage, this paper focuses on the effectiveness of feature selection. First, we perform an instruction-level analysis of the most frequent mnemonics. Second, we propose new methods to select effective features using MANOVA statistical tests. Finally, we feed the selected features to a classifier. Our approach significantly reduces the number of features from 390 to 4, which explain 99.4% of the variation in the data. With the selected features, we classify malware samples and achieve 96.2% accuracy and a 0.6% false positive rate.
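The two steps named in the abstract, mnemonic frequency analysis and a MANOVA-style test, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`mnemonic_frequencies`, `wilks_lambda`), the input format (one instruction string per line, mnemonic as the first token), and the restriction to a pair of features are all assumptions made for illustration. Wilks' lambda is used here as one common MANOVA test statistic; the abstract does not say which statistic the paper uses.

```python
from collections import Counter


def mnemonic_frequencies(listing):
    """Relative frequency of each mnemonic in one disassembly listing.

    Assumption: `listing` is a list of instruction strings such as
    "mov eax, ebx", and the mnemonic is the first whitespace-separated token.
    """
    mnemonics = [line.split()[0].lower() for line in listing if line.strip()]
    counts = Counter(mnemonics)
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


def _mean2(rows):
    """Mean vector of a list of (x, y) observations."""
    n = len(rows)
    return (sum(x for x, _ in rows) / n, sum(y for _, y in rows) / n)


def _sscp(rows, center):
    """Entries (sxx, sxy, syy) of the 2x2 sums-of-squares-and-cross-products
    matrix of `rows` about `center`."""
    sxx = sum((x - center[0]) ** 2 for x, _ in rows)
    syy = sum((y - center[1]) ** 2 for _, y in rows)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in rows)
    return sxx, sxy, syy


def wilks_lambda(groups):
    """Wilks' lambda det(E) / det(T) for a pair of features.

    `groups` maps a class label to a list of (x, y) feature pairs.
    E is the within-group SSCP matrix and T the total SSCP matrix.
    Values near 0 indicate strong separation between the group means;
    values near 1 indicate little separation.
    """
    all_rows = [row for rows in groups.values() for row in rows]
    txx, txy, tyy = _sscp(all_rows, _mean2(all_rows))
    exx = exy = eyy = 0.0
    for rows in groups.values():
        sxx, sxy, syy = _sscp(rows, _mean2(rows))
        exx, exy, eyy = exx + sxx, exy + sxy, eyy + syy
    det_t = txx * tyy - txy ** 2
    if det_t == 0:
        raise ValueError("total SSCP matrix is singular")
    return (exx * eyy - exy ** 2) / det_t
```

In such a sketch, each sample would first be turned into a frequency vector with `mnemonic_frequencies`, and candidate feature pairs with a small `wilks_lambda` across malware classes would be the ones worth keeping. Selecting more than two features at once would require general matrix determinants, which this illustration deliberately leaves out.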

Keywords: malware classification; statistical analysis; security.

DOI: 10.1504/IJITST.2020.108133

International Journal of Internet Technology and Secured Transactions, 2020 Vol.10 No.4, pp.383 - 395

Received: 07 Apr 2018
Accepted: 17 Nov 2018

Published online: 03 Jul 2020
