XAI-PhD: Fortifying Trust of Phishing URL Detection Empowered by Shapley Additive Explanations

Mustafa Al-Fayoumi; Bushra Alhijawi; Qasem Abu Al-Haija; Rakan Armoush

doi:10.3991/ijoe.v20i11.49533

Authors

Mustafa Al-Fayoumi https://orcid.org/0000-0002-3129-5193
Bushra Alhijawi https://orcid.org/0000-0003-0806-102X
Qasem Abu Al-Haija Jordan University of Science and Technology https://orcid.org/0000-0003-2422-0297
Rakan Armoush https://orcid.org/0009-0008-3223-0552

Keywords:

Cybersecurity; Machine Learning; Explainable AI; Phishing Detection; Feature Engineering; Malicious URLs; SHapley Additive exPlanations (SHAP)

Abstract

The rapid growth of the Internet has increased the demand for online services, and this surge in online activity has also brought a serious threat: phishing attacks. Phishing is a type of cyberattack that combines social engineering and technical manipulation to steal sensitive information from unsuspecting individuals. Consequently, there is a rising need for dependable phishing URL detection models that identify phishing URLs with high accuracy and low prediction overhead. This study introduces XAI-PhD, a phishing detection method that combines machine learning (ML) with SHapley Additive exPlanations (SHAP). Specifically, XAI-PhD uses SHAP to analyze how much each feature influences the classifier's decisions. Input features are then selected according to their SHAP values, so that only the most important attributes are retained, which supports a more adaptable and generalizable model. XAI-PhD uses a lightweight gradient boosting machine as its classifier, and a series of experiments compares its performance with established baseline methods. The empirical results show that XAI-PhD achieves an accuracy of 99.8% and an F1-score of 99%. Moreover, XAI-PhD is computationally efficient, requiring only 1.47 milliseconds and 18.5 microseconds per record to generate predictions.
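
The idea of ranking features by their Shapley contributions and keeping only the top ones can be illustrated with a minimal, dependency-free sketch. It is not the authors' pipeline: the feature names, the toy data, and the linear `toy_score` function standing in for a trained gradient boosting classifier are all hypothetical, and the Shapley values are computed by brute-force enumeration against a mean baseline rather than with the SHAP library.

```python
from itertools import combinations
from math import factorial

# Hypothetical URL features: [url_length, num_dots, has_at_symbol, digit_count]
rows = [
    [20, 1, 0, 3],
    [75, 4, 1, 0],
    [30, 2, 0, 9],
    [90, 5, 1, 2],
]

def toy_score(f):
    """Stand-in for a trained classifier's score (assumption, not LightGBM)."""
    return 0.02 * f[0] + 0.5 * f[1] + 2.0 * f[2] + 0.0 * f[3]

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Coalition members take their real value; the rest stay at baseline.
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

def select_top_features(model, data, k):
    """Rank features by mean absolute Shapley value and keep the top k indices."""
    n = len(data[0])
    baseline = [sum(r[j] for r in data) / len(data) for j in range(n)]
    importance = [0.0] * n
    for r in data:
        for j, value in enumerate(shapley_values(model, r, baseline)):
            importance[j] += abs(value) / len(data)
    ranked = sorted(range(n), key=lambda j: importance[j], reverse=True)
    return ranked[:k], importance

top, importance = select_top_features(toy_score, rows, 2)
print("selected feature indices:", top)
print("mean |Shapley| per feature:", importance)
```

Brute-force enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one; tree-based explainers exist precisely to avoid that cost for gradient boosting models.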