Explainable phishing website detection for secure and sustainable cyber infrastructure

Phishing is a social engineering attack and a type of cybercrime that is dangerously and constantly on the rise. Phishing attacks can impact various sectors, including governmental, social, financial, and individual businesses. Traditional methods of identifying phishing...

Bibliographic details
Main author: Tanzila Kehkashan (20748842) (author)
Other authors: Maha Abdelhaq (735574) (author), Ahmad Sami Al-Shamayleh (17541495) (author), Nazish Huda (22682342) (author), Imran Ashraf Yaseen (22682345) (author), Abdelmuttlib Ibrahim Abdalla Ahmed (22682348) (author), Adnan Akhunzada (20151648) (author)
Published: 2025
_version_ 1864513536809500672
author Tanzila Kehkashan (20748842)
author2 Maha Abdelhaq (735574)
Ahmad Sami Al-Shamayleh (17541495)
Nazish Huda (22682342)
Imran Ashraf Yaseen (22682345)
Abdelmuttlib Ibrahim Abdalla Ahmed (22682348)
Adnan Akhunzada (20151648)
author2_role author
author
author
author
author
author
author_facet Tanzila Kehkashan (20748842)
Maha Abdelhaq (735574)
Ahmad Sami Al-Shamayleh (17541495)
Nazish Huda (22682342)
Imran Ashraf Yaseen (22682345)
Abdelmuttlib Ibrahim Abdalla Ahmed (22682348)
Adnan Akhunzada (20151648)
author_role author
dc.creator.none.fl_str_mv Tanzila Kehkashan (20748842)
Maha Abdelhaq (735574)
Ahmad Sami Al-Shamayleh (17541495)
Nazish Huda (22682342)
Imran Ashraf Yaseen (22682345)
Abdelmuttlib Ibrahim Abdalla Ahmed (22682348)
Adnan Akhunzada (20151648)
dc.date.none.fl_str_mv 2025-11-25T03:00:00Z
dc.identifier.none.fl_str_mv 10.1038/s41598-025-27984-w
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Explainable_phishing_website_detection_for_secure_and_sustainable_cyber_infrastructure/31995087
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Information and computing sciences
Cybersecurity and privacy
Data management and data science
Machine learning
Phishing website detection
RF
SHAP
URL
dc.title.none.fl_str_mv Explainable phishing website detection for secure and sustainable cyber infrastructure
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Phishing is a social engineering attack and a form of cybercrime that is steadily on the rise. Phishing attacks affect many sectors, including governmental, social, financial, and individual businesses. Traditional methods of identifying phishing websites, such as blacklist and heuristic approaches, often fail to provide sufficient protection. Techniques that combine URLs, webpage content, and external features are time-consuming, require substantial computing power, and are unsuitable for devices with limited resources. Previous research has also often overlooked which features are important for detection and how they affect outcomes, and traditional methods may not fully capture the significance of individual features. To address this, this research applies Shapley additive explanations (SHAP) as a feature selection technique, with each model based primarily on the URL. The "Phishing Website Detection" dataset from the Kaggle repository, with more than 11,000 URLs and 30 varied features, was used. Five models were trained and tested: support vector machine (SVM), random forest (RF), decision tree (DT), logistic regression (LR), and K-nearest neighbor (KNN). SHAP was applied to each model to improve precision and interpretability by highlighting the most important features. The models were evaluated using key performance metrics: accuracy, precision, recall, and F1 score. Among the models tested, the random forest model achieved 97% accuracy. The proposed system offers an interpretable solution for phishing detection that contributes to a safer digital environment.</p><h2 dir="ltr">Other Information</h2><p dir="ltr">Published in: Scientific Reports<br>License: <a href="https://creativecommons.org/licenses/by/4.0" target="_blank">https://creativecommons.org/licenses/by/4.0</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1038/s41598-025-27984-w" target="_blank">https://dx.doi.org/10.1038/s41598-025-27984-w</a></p>
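The description names four evaluation metrics: accuracy, precision, recall, and F1 score. As an illustration only, the sketch below computes them for a binary phishing/legitimate classifier from predicted and true labels. The function name `binary_metrics`, the `Metrics` container, and the toy labels are hypothetical and are not taken from the article; the formulas are the standard definitions, not necessarily the authors' exact evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics.

    `positive` marks the class treated as "phishing" (an assumption
    made for this sketch; the label encoding is not given in the record).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t == positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for illustration only: 1 = phishing, 0 = legitimate.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters for phishing detection because a missed phishing page (false negative) and a blocked legitimate page (false positive) carry different costs, and accuracy alone does not distinguish them.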
eu_rights_str_mv openAccess
id Manara2_b674393dd9c0702135e2ef3b573cd2a4
identifier_str_mv 10.1038/s41598-025-27984-w
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/31995087
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title Explainable phishing website detection for secure and sustainable cyber infrastructure
topic Information and computing sciences
Cybersecurity and privacy
Data management and data science
Machine learning
Phishing website detection
RF
SHAP
URL