Title: Malicious webpages analysis and detection algorithm based on BiLSTM
Authors: Huan-Huan Wang; Long Yu; Sheng-Wei Tian; Shi-Qi Luo; Xin-Jun Pei
Addresses: School of Software, Xinjiang University, No. 499, Xibei Road, Saybagh District, Urumqi, Xinjiang 830008, China ' Network Center, Xinjiang University, No. 666, Shengli Road, Tianshan District, Urumqi, Xinjiang 830046, China ' School of Software, Xinjiang University, No. 499, Xibei Road, Saybagh District, Urumqi, Xinjiang 830008, China; School of Information Science and Engineering, Xinjiang University, No. 666, Shengli Road, Tianshan District, Urumqi, Xinjiang 830046, China ' School of Software, Xinjiang University, No. 499, Xibei Road, Saybagh District, Urumqi, Xinjiang 830008, China ' School of Information Science and Engineering, Xinjiang University, No. 666, Shengli Road, Tianshan District, Urumqi, Xinjiang 830046, China
Abstract: This paper proposes a malicious webpage analysis and detection algorithm based on bidirectional long short-term memory (BiLSTM). Drawing on a study of the characteristics of malicious webpages, a 'texture image' feature is extracted to express the similarity between the URL binary files of malicious webpages; host information features and URL information features are also extracted. The 'texture image' feature is fused with the host information features and URL information features, and a BiLSTM deep learning model is used to analyse and detect malicious webpages. Compared with LSTM, k-nearest neighbours (KNN), IndRNN, CNN and Gaussian naive Bayes (Gaussian NB), the experimental results show that the proposed algorithm achieves higher accuracy than these baseline models.
Keywords: webpages; bidirectional long short-term memory; BiLSTM; texture image; deep learning.
International Journal of Electronic Business, 2020 Vol.15 No.4, pp.351 - 367
Received: 19 Sep 2018
Accepted: 18 Mar 2019
Published online: 09 Nov 2020
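
The abstract describes the 'texture image' feature and its fusion with host and URL features only at a high level. The following is a minimal sketch of one common way such a feature can be built, assuming each byte of a binary file is treated as one grayscale pixel and the result is flattened and concatenated with the other features before being passed to a sequence model. The function names `bytes_to_texture` and `fuse_features`, the image width, the zero padding, and the example feature values are all illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def bytes_to_texture(data: bytes, width: int = 16) -> List[List[int]]:
    """Lay raw bytes out row by row as a grayscale image (one byte = one pixel, 0-255).

    Assumption: the final row is zero-padded so every row has `width` pixels.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width          # bytes needed to fill the last row
    padded = data + bytes(padding)          # bytes(n) yields n zero bytes
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def fuse_features(
    texture: Sequence[Sequence[int]],
    host_features: Sequence[float],
    url_features: Sequence[float],
) -> List[float]:
    """Concatenate scaled pixels with host and URL features into one vector."""
    pixels = [p / 255.0 for row in texture for p in row]  # scale pixels to [0, 1]
    return pixels + [float(x) for x in host_features] + [float(x) for x in url_features]


# Hypothetical inputs: a short byte string and made-up host/URL feature values.
texture = bytes_to_texture(b"http://example.com/a", width=8)
fused = fuse_features(texture, host_features=[1.0, 0.0], url_features=[20.0, 3.0, 1.0])
```

A vector such as `fused` could then be fed to a BiLSTM classifier built in a deep learning framework; that step is omitted here because the abstract does not specify the network configuration.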