Development of Ensemble SVM–LSTM Model for Phishing Website Detection

Abubakar L. IBRAHEEM; John K. ALHASSAN; Noel D. MOSES; Suleiman AHMAD

doi:10.33003/wdpbpj85

Authors

Abubakar L. IBRAHEEM

Department of Cyber Security Science, Federal University of Technology, Minna, Nigeria

John K. ALHASSAN

Department of Cyber Security Science, Federal University of Technology, Minna, Nigeria

Noel D. MOSES

Department of Cyber Security Science, Federal University of Technology, Minna, Nigeria

Suleiman AHMAD

Department of Cyber Security Science, Federal University of Technology, Minna, Nigeria

Keywords:

Phishing, Cybercrime, Cyber Threats, Identity theft, Ensemble Learning.

Abstract

Phishing is a criminal mechanism that employs social engineering techniques to exploit human vulnerabilities and technical loopholes, deceiving users into divulging sensitive information that is in turn used for fraudulent activities. Although many machine learning approaches have been proposed in the literature, they often struggle with scalability, adaptability to emerging threats, and the trade-off between detection accuracy and computational efficiency. This study aims to enhance phishing website detection through the implementation of an ensemble model that integrates Support Vector Machine (SVM) and Long Short-Term Memory (LSTM) networks. The methodology involves collecting a diverse dataset of 343,525 phishing and legitimate URLs: 167,582 phishing URLs downloaded in CSV format from PhishTank and Kaggle, and 175,943 legitimate URLs downloaded from the UC Irvine Machine Learning Repository and Kaggle. Forty-nine static and sequential features were extracted, and 15 were selected as the most relevant using Recursive Feature Elimination (RFE) and univariate statistical tests. An ensemble architecture integrating SVM and LSTM networks was then trained on the selected features using stratified k-fold cross-validation. The results demonstrate that the proposed approach achieves a detection accuracy of 97.58%, precision of 93.54%, recall of 96.1%, and F1-score of 95.78%, outperforming traditional models and various benchmark classifiers. The findings highlight the effectiveness of combining static and sequential features within an ensemble framework to improve the generalization and robustness of phishing detection systems. Exploring deep learning architectures such as CNNs, LSTMs, and GANs in a hybrid framework will likely boost detection capabilities against evolving cyber threats.
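
The distinction between static and sequential URL features, and the idea of combining two model outputs, can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the abstract does not list the 49 extracted features, so the six lexical features below are hypothetical examples of the static kind; the character-ID encoding is one common way to prepare a URL for a sequence model such as an LSTM; and the weighted average in `soft_vote` is only one possible combination rule, since the abstract does not state how the SVM and LSTM outputs are merged.

```python
from urllib.parse import urlparse


def static_features(url: str) -> dict:
    """Hypothetical lexical (static) features; not the paper's actual feature set."""
    # Add a scheme when missing so urlparse can locate the host.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
        # Rough IPv4 check: four dot-separated groups of digits.
        "host_is_ip": int(host.count(".") == 3 and host.replace(".", "").isdigit()),
    }


def char_sequence(url: str, max_len: int = 20) -> list:
    """Encode a URL as a fixed-length list of integer character IDs.

    ASCII characters map to 1..128, anything else to 129, and 0 is padding.
    A sequence model would consume this list (assumed encoding).
    """
    ids = [min(ord(ch), 128) + 1 for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))


def soft_vote(p_svm: float, p_lstm: float, w_svm: float = 0.5):
    """Weighted average of two phishing probabilities (assumed combination rule)."""
    p = w_svm * p_svm + (1.0 - w_svm) * p_lstm
    return p, int(p >= 0.5)  # label 1 = phishing, 0 = legitimate


if __name__ == "__main__":
    example = "http://192.168.0.1/login@secure"
    print(static_features(example))
    print(char_sequence(example))
    # The two probabilities are placeholders for real model outputs.
    print(soft_vote(0.9, 0.7))
```

In a full pipeline, the static feature dictionary would feed the feature-selection step and the SVM, while the padded character sequence would feed the LSTM; the placeholder probabilities passed to `soft_vote` stand in for the trained models' outputs.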

Published

02-03-2026

Issue

Vol. 2 No. 1 (2026): June 2026

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

How to Cite

[1]

Abubakar L. IBRAHEEM, John K. ALHASSAN, Noel D. MOSES, and Suleiman AHMAD, “Development of Ensemble SVM–LSTM Model for Phishing Website Detection”, FJET, vol. 2, no. 1, pp. 21–30, Mar. 2026, doi: 10.33003/wdpbpj85.
