Title: A malware variants detection methodology with an opcode-based feature learning method and a fast density-based clustering algorithm

Authors: Hui Yin; Jixin Zhang; Zheng Qin

Addresses: Hunan Provincial Key Laboratory of Network Investigational Technology, College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan, China; College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China; College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China

Abstract: Malware, which can be defined as any type of malicious code intended to harm a computer or network, is one of the major security threats facing the internet today. Because malware variants can be equipped with sophisticated mechanisms to bypass traditional detection systems, in this paper we propose a malware variant detection approach that can automatically, rapidly and accurately detect malware variants. In our approach, we present an asynchronous architecture for automated training and detection. Under this architecture, to improve the detection speed while retaining accuracy, we propose an information entropy-based feature extraction method that extracts a small number of highly informative features, and a distance-based weight learning method that weights these features. To further improve the detection speed, we propose a fast density-based clustering (FDBC) algorithm. We evaluate our approach on a number of Windows-based malware instances belonging to six large families, and our experiments demonstrate that our automated malware variant detection method achieves high accuracy with a significant speedup compared with other state-of-the-art approaches.

Keywords: distance-based weight learning; fast density-based clustering; FDBC; information entropy; malware variants.

DOI: 10.1504/IJCSE.2020.105209

International Journal of Computational Science and Engineering, 2020 Vol.21 No.1, pp.19 - 29

Received: 08 Dec 2016
Accepted: 07 Nov 2017

Published online: 11 Feb 2020
