Title: Malicious web pages detection using feature selection techniques and machine learning

Authors: Dharmaraj R. Patil; Jayantrao B. Patil

Addresses: Department of Computer Engineering, R.C. Patel Institute of Technology, Shirpur, Maharashtra, India; Department of Computer Engineering, R.C. Patel Institute of Technology, Shirpur, Maharashtra, India

Abstract: Although researchers have proposed significant solutions for detecting malicious web pages in recent years, many open issues remain. This paper proposes a methodology for the effective detection of malicious web pages using feature selection methods and machine learning. The methodology consists of three modules: feature selection, training and classification. To evaluate it, six feature selection methods and eight supervised machine learning classifiers are used. Experiments are performed on a balanced binary dataset. With feature selection, the classifiers achieved detection accuracy of 94-99% and above, error rates of 0.19-5.55%, FPR of 0.006-0.094 and FNR of 0.000-0.013 with minimum system overhead. Our multi-model system, which combines a majority voting classifier with the wrapper+Naive Bayes feature selection method and the GreedyStepwise search technique, achieved the highest accuracy of 99.15%, with an FPR of 0.017 and an FNR of 0.000, using only 15 features.
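The three ideas named in the abstract (a wrapper-style greedy feature search, majority voting across classifiers, and the reported error metrics) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the forward direction of the search, the stop-when-no-improvement rule, the tie-breaking behaviour of the vote and the label convention (1 = malicious, 0 = benign) are all assumptions made for illustration. In a real wrapper setup, the `score` callback would be something like the cross-validated accuracy of a Naive Bayes classifier trained on the candidate feature subset.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple


def greedy_forward_select(
    n_features: int,
    score: Callable[[Tuple[int, ...]], float],
    max_features: int,
) -> Tuple[int, ...]:
    """Greedy forward search (assumed variant): at each step add the single
    feature that raises the wrapper score the most; stop when no remaining
    feature improves the score or when max_features is reached."""
    selected: Tuple[int, ...] = ()
    best = float("-inf")
    while len(selected) < max_features:
        step_best, step_feature = best, None
        for f in range(n_features):
            if f in selected:
                continue
            s = score(selected + (f,))
            if s > step_best:
                step_best, step_feature = s, f
        if step_feature is None:  # no candidate improved the score
            break
        selected += (step_feature,)
        best = step_best
    return selected


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """predictions[i][j] is classifier i's label for sample j.
    Ties (possible with an even number of classifiers) go to the label
    seen first for that sample; this tie rule is an assumption."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]


def detection_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, error rate, FPR = FP / (FP + TN) and FNR = FN / (FN + TP),
    with 1 = malicious (positive) and 0 = benign (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Toy data, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
clf_a = [1, 1, 0, 0, 0, 1, 0, 1]
clf_b = [1, 1, 1, 0, 1, 1, 0, 1]
clf_c = [1, 0, 1, 0, 0, 0, 0, 1]

voted = majority_vote([clf_a, clf_b, clf_c])
print(voted)
print(detection_rates(y_true, voted))
```

On a balanced dataset such as the one described, accuracy and error rate are directly interpretable, while FPR and FNR separate the two kinds of mistakes: benign pages flagged as malicious and malicious pages that are missed.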

Keywords: malicious web pages; feature selection; machine learning; supervised learning; multi-model system; web security; cyber security.

DOI: 10.1504/IJHPCN.2019.102355

International Journal of High Performance Computing and Networking, 2019 Vol.14 No.4, pp.473 - 488

Received: 18 Dec 2017
Accepted: 25 Jun 2018

Published online: 23 Sep 2019
