Title: Enhanced phishing URL identification using an integrated attention-based LSTM-CNN with hybrid features

Authors: Santosh Kumar Birthriya; Priyanka Ahlawat; Ankit Kumar Jain

Addresses: National Institute of Technology, Kurukshetra-136119, Haryana, India; National Institute of Technology, Kurukshetra-136119, Haryana, India; National Institute of Technology, Kurukshetra-136119, Haryana, India

Abstract: Phishing attacks continue to pose a significant threat to online security, targeting users' personal and financial information through deceptive URLs and websites. This study proposes a robust hybrid deep learning model for phishing URL detection. Our approach follows a multi-step methodology comprising URL data pre-processing, advanced feature engineering, and the application of deep learning techniques for precise URL classification. Feature engineering incorporates TF-IDF vectorisation, principal component analysis, and natural language processing-based feature extraction, forming a comprehensive hybrid feature set that improves detection accuracy. Experimental results reveal that hybrid features significantly enhance the performance of deep learning models, with the proposed attention-based LSTM-CNN model achieving the highest accuracy of 99.92%. This research underscores the potential of advanced hybrid architectures in cybersecurity applications and offers an efficient solution for real-time phishing detection.
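
The hybrid feature idea described in the abstract can be illustrated with a minimal sketch that concatenates TF-IDF weights over character n-grams with a few handcrafted lexical URL features. Everything here is an assumption made for illustration: the function names, the choice of character trigrams, the smoothed IDF formula, the six lexical features, and the example URLs are hypothetical and are not taken from the paper. The principal component analysis step and the attention-based LSTM-CNN classifier are deliberately omitted.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def char_ngrams(url, n=3):
    """Split a lower-cased URL into overlapping character n-grams."""
    url = url.lower()
    return [url[i:i + n] for i in range(len(url) - n + 1)]


def tfidf_vectors(urls, n=3):
    """Return (vocabulary, rows) with one TF-IDF row per URL."""
    docs = [Counter(char_ngrams(u, n)) for u in urls]
    vocab = sorted({gram for doc in docs for gram in doc})
    n_docs = len(docs)
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1
    idf = {
        gram: math.log((1 + n_docs) / (1 + sum(1 for d in docs if gram in d))) + 1
        for gram in vocab
    }
    rows = []
    for doc in docs:
        total = sum(doc.values()) or 1  # guard against URLs shorter than n
        rows.append([doc[gram] / total * idf[gram] for gram in vocab])
    return vocab, rows


def lexical_features(url):
    """Handcrafted URL features (illustrative choices, not the paper's set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return [
        float(len(url)),                       # total URL length
        float(url.count(".")),                 # number of dots
        float(sum(c.isdigit() for c in url)),  # number of digits
        float("@" in url),                     # presence of '@'
        float(parsed.scheme == "https"),       # uses HTTPS
        float(host.count("-")),                # hyphens in the hostname
    ]


def hybrid_features(urls):
    """Concatenate each URL's TF-IDF row with its lexical features."""
    _, tfidf_rows = tfidf_vectors(urls)
    return [row + lexical_features(u) for row, u in zip(tfidf_rows, urls)]


# Hypothetical example URLs
urls = ["https://example.com/login", "http://examp1e-secure.com/@verify"]
features = hybrid_features(urls)
```

Each row in `features` has the same length (vocabulary size plus six lexical values), so the rows could be passed to a dimensionality reduction step or a classifier. In practice a library implementation of TF-IDF and PCA would likely replace the hand-written parts.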

Keywords: phishing; natural language processing; NLP; long short-term memory; LSTM; convolutional neural network; CNN; cybersecurity.

DOI: 10.1504/IJSN.2025.145035

International Journal of Security and Networks, 2025 Vol.20 No.1, pp.8 - 22

Received: 19 Nov 2023
Accepted: 13 Nov 2024

Published online: 17 Mar 2025
