Authors: Ahmad Azab
Addresses: College of Engineering and Technology, American University of the Middle East, Kuwait
Abstract: Malware remains a serious threat on the internet and is the main tool utilised by cybercriminals to conduct malicious actions against corporations, government agencies and individuals. Malware authors embed numerous techniques, such as obfuscation and morphing, to avoid detection by anti-virus engines and to make zero-day detection harder. To address this problem, we propose a solution that groups malware binaries belonging to the same variant, regardless of whether they are packed or not. Our approach measures similarity between malware binaries of the same variant by applying data mining concepts in conjunction with hashing algorithms. In this paper, we assess the Trend Micro Locality Sensitive Hash (TLSH) and SSDEEP hashing algorithms for grouping packed and unpacked binaries of the same variant, using the k-nearest neighbours (K-NN) learning algorithm. Two Zeus variants (Mal ZBOT and TSPY ZBOT) are used to assess the effectiveness of the proposed approach. The experimental results show that our method is effective at grouping binaries of the same variant and resilient to common obfuscations used by cybercriminals, whereas applying the hashing algorithms without the data mining step performs poorly. The best result attained over both packed and unpacked binaries is an F-measure of 0.982.
Keywords: malware; hashing; data mining; Zeus.
International Journal of Security and Networks, 2020 Vol.15 No.3, pp.123 - 132
Received: 13 Jun 2019
Accepted: 04 Aug 2019
Published online: 11 Sep 2020
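
The grouping idea summarised in the abstract (a similarity hash per binary, then K-NN over the distances between hashes) can be illustrated with a minimal sketch. This is not the paper's implementation: the digests below are invented toy strings, and `hex_distance` is a hypothetical stand-in that counts differing characters, used here only so the example runs without third-party packages. In practice it would be replaced by the comparison score of a real similarity hash such as TLSH or SSDEEP.

```python
from collections import Counter


def hex_distance(a: str, b: str) -> int:
    """Stand-in distance (assumption): number of differing positions
    between two equal-length digest strings. A real pipeline would use
    the comparison function of a similarity hash instead."""
    if len(a) != len(b):
        raise ValueError("digests must have the same length")
    return sum(x != y for x, y in zip(a, b))


def knn_predict(query, labelled, k=3, distance=hex_distance):
    """Return the majority label among the k labelled digests closest to query."""
    neighbours = sorted(labelled, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy digests (invented for illustration), labelled with the two variant names.
labelled = [
    ("A1B2C3D4", "Mal ZBOT"),
    ("A1B2C3D5", "Mal ZBOT"),
    ("A1B2C4D4", "Mal ZBOT"),
    ("0F9E8D7C", "TSPY ZBOT"),
    ("0F9E8D7B", "TSPY ZBOT"),
    ("0F9E8E7C", "TSPY ZBOT"),
]

print(knn_predict("A1B2C3D6", labelled))  # closest digests are all Mal ZBOT
print(knn_predict("0F9E8D70", labelled))  # closest digests are all TSPY ZBOT
```

Passing the distance as a parameter keeps the voting logic independent of the hash in use, so two hashing algorithms can be compared under the same classifier.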