Authors: Dharmaraj R. Patil; Jayantrao B. Patil
Addresses: Department of Computer Engineering, R.C. Patel Institute of Technology, Shirpur, Maharashtra, India; Department of Computer Engineering, R.C. Patel Institute of Technology, Shirpur, Maharashtra, India
Abstract: In recent years, researchers have proposed significant solutions for detecting malicious web pages, yet many open issues remain. This paper proposes a methodology for the effective detection of malicious web pages using feature selection methods and machine learning. Our methodology consists of three modules: feature selection, training and classification. To evaluate the methodology, six feature selection methods and eight supervised machine learning classifiers are used. Experiments are performed on a balanced binary dataset. With feature selection, the classifiers achieved a detection accuracy ranging from 94% to above 99%, an error rate of 0.19-5.55%, an FPR of 0.006-0.094 and an FNR of 0.000-0.013, with minimum system overhead. Our multi-model system, which combines a majority voting classifier with the wrapper+Naive Bayes feature selection method and the GreedyStepwise search technique, achieved the highest accuracy of 99.15%, with an FPR of 0.017 and an FNR of 0.000, using only 15 features.
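As an illustration only, the following minimal sketch shows how a majority voting classifier can combine the labels predicted by several models, and how accuracy, error rate, FPR and FNR are computed from the resulting predictions. It is not the authors' implementation: the function names `majority_vote` and `error_metrics`, the three toy prediction lists, and the convention that label 1 means "malicious" are all assumptions made for this example, and the numbers it produces are unrelated to the results reported in the abstract.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    `predictions` is a list of equally long label lists, one per classifier.
    With an odd number of classifiers and binary labels no ties can occur.
    """
    n_samples = len(predictions[0])
    voted = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        voted.append(votes.most_common(1)[0][0])
    return voted


def error_metrics(y_true, y_pred, positive=1):
    """Return accuracy, error rate, FPR and FNR for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth != positive and pred != positive:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        # FPR: share of benign pages wrongly flagged as malicious.
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        # FNR: share of malicious pages that were missed.
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Hypothetical toy data: 1 = malicious, 0 = benign (assumed convention).
y_true = [1, 1, 0, 0, 1, 0]
clf_a = [1, 1, 0, 1, 1, 0]
clf_b = [1, 0, 0, 1, 1, 0]
clf_c = [1, 1, 1, 0, 1, 0]

ensemble = majority_vote([clf_a, clf_b, clf_c])
metrics = error_metrics(y_true, ensemble)
print(ensemble)
print(metrics)
```

In this toy run the ensemble corrects the isolated mistakes of `clf_b` and `clf_c` but still mislabels the one benign page that two of the three classifiers flag, which is why the FPR is non-zero while the FNR stays at zero.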
Keywords: malicious web pages; feature selection; machine learning; supervised learning; multi-model system; web security; cyber security.
International Journal of High Performance Computing and Networking, 2019 Vol.14 No.4, pp.473 - 488
Received: 18 Dec 2017
Accepted: 25 Jun 2018
Published online: 19 Sep 2019