Explainable phishing website detection for secure and sustainable cyber infrastructure
Phishing is a social engineering attack and a type of cybercrime that is dangerously and constantly on the rise. Phishing attacks can impact various sectors, including governmental, social, financial, and individual businesses. Traditional methods of identifying phishing...
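| Main author: | Tanzila Kehkashan |
|---|---|
| Other authors: | Maha Abdelhaq, Ahmad Sami Al-Shamayleh, Nazish Huda, Imran Ashraf Yaseen, Abdelmuttlib Ibrahim Abdalla Ahmed, Adnan Akhunzada |
| Published: | 2025 |
| Subjects: | Information and computing sciences; Cybersecurity and privacy; Data management and data science; Machine learning; Phishing website detection; RF; SHAP; URL |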
| _version_ | 1864513536809500672 |
|---|---|
| author | Tanzila Kehkashan (20748842) |
| author2 | Maha Abdelhaq (735574) Ahmad Sami Al-Shamayleh (17541495) Nazish Huda (22682342) Imran Ashraf Yaseen (22682345) Abdelmuttlib Ibrahim Abdalla Ahmed (22682348) Adnan Akhunzada (20151648) |
| author2_role | author author author author author author |
| author_facet | Tanzila Kehkashan (20748842) Maha Abdelhaq (735574) Ahmad Sami Al-Shamayleh (17541495) Nazish Huda (22682342) Imran Ashraf Yaseen (22682345) Abdelmuttlib Ibrahim Abdalla Ahmed (22682348) Adnan Akhunzada (20151648) |
| author_role | author |
| dc.creator.none.fl_str_mv | Tanzila Kehkashan (20748842) Maha Abdelhaq (735574) Ahmad Sami Al-Shamayleh (17541495) Nazish Huda (22682342) Imran Ashraf Yaseen (22682345) Abdelmuttlib Ibrahim Abdalla Ahmed (22682348) Adnan Akhunzada (20151648) |
| dc.date.none.fl_str_mv | 2025-11-25T03:00:00Z |
| dc.identifier.none.fl_str_mv | 10.1038/s41598-025-27984-w |
| dc.relation.none.fl_str_mv | https://figshare.com/articles/journal_contribution/Explainable_phishing_website_detection_for_secure_and_sustainable_cyber_infrastructure/31995087 |
| dc.rights.none.fl_str_mv | CC BY 4.0 info:eu-repo/semantics/openAccess |
| dc.subject.none.fl_str_mv | Information and computing sciences Cybersecurity and privacy Data management and data science Machine learning Machine learning Phishing website detection RF SHAP URL |
| dc.title.none.fl_str_mv | Explainable phishing website detection for secure and sustainable cyber infrastructure |
| dc.type.none.fl_str_mv | Text Journal contribution info:eu-repo/semantics/publishedVersion text contribution to journal |
| description | <p dir="ltr">Phishing is a social engineering attack and a form of cybercrime that is steadily on the rise. Phishing attacks can affect many sectors, including governmental, social, financial, and individual businesses. Traditional methods of identifying phishing websites, such as blacklist and heuristic approaches, often fail to provide sufficient protection. Moreover, traditional techniques that combine URLs, webpage content, and external features are time-consuming, require substantial computing power, and are unsuitable for devices with limited resources. Previous research has also often overlooked the critical role of identifying which features are important for detection and how they affect outcomes, so traditional methods might not fully capture the significance of individual features. To overcome this issue, this research applies feature selection techniques, specifically Shapley additive explanations (SHAP), with each model based primarily on the URL to improve the detection process. The "Phishing Website Detection" dataset from the Kaggle repository, containing more than 11,000 URLs and 30 varied features, was used. Five models, namely support vector machine (SVM), random forest (RF), decision tree (DT), logistic regression (LR), and K-nearest neighbor, were then trained and tested. Each model used SHAP to improve precision and interpretability by highlighting the most important features. The models were evaluated using key performance metrics such as accuracy, precision, recall, and F1 score. Among the models tested, the random forest model reached 97% accuracy. The proposed system offers a comprehensive and interpretable solution for phishing detection that contributes to a safer digital environment.</p><h2 dir="ltr">Other Information</h2><p dir="ltr">Published in: Scientific Reports<br>License: <a href="https://creativecommons.org/licenses/by/4.0" target="_blank">https://creativecommons.org/licenses/by/4.0</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.1038/s41598-025-27984-w" target="_blank">https://dx.doi.org/10.1038/s41598-025-27984-w</a></p> |
| eu_rights_str_mv | openAccess |
| id | Manara2_b674393dd9c0702135e2ef3b573cd2a4 |
| identifier_str_mv | 10.1038/s41598-025-27984-w |
| network_acronym_str | Manara2 |
| network_name_str | Manara2 |
| oai_identifier_str | oai:figshare.com:article/31995087 |
| publishDate | 2025 |
| repository.mail.fl_str_mv | |
| repository.name.fl_str_mv | |
| repository_id_str | |
| rights_invalid_str_mv | CC BY 4.0 |
| status_str | publishedVersion |
| title | Explainable phishing website detection for secure and sustainable cyber infrastructure |
| topic | Information and computing sciences Cybersecurity and privacy Data management and data science Machine learning Machine learning Phishing website detection RF SHAP URL |
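
The abstract in this record relies on Shapley additive explanations to rank feature importance. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values for a toy linear scoring function by averaging each feature's marginal contribution over every feature ordering. The feature names, weights, and baseline are hypothetical and are not taken from the paper or its dataset, and this is neither the authors' implementation nor the API of the `shap` library, which uses faster approximations.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    n = len(instance)
    totals = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)        # start with every feature at its baseline value
        previous = predict(current)
        for i in order:
            current[i] = instance[i]    # switch feature i to its actual value
            value = predict(current)
            totals[i] += value - previous
            previous = value
    return [t / factorial(n) for t in totals]


# Hypothetical URL-based features and weights, for illustration only.
FEATURE_NAMES = ["has_ip_address", "url_length_long", "uses_https"]
WEIGHTS = [2.0, 1.0, -1.5]


def toy_score(features):
    """Toy linear phishing score: higher means more suspicious."""
    return sum(w * x for w, x in zip(WEIGHTS, features))


instance = [1, 1, 0]   # the URL being explained (assumed values)
baseline = [0, 0, 1]   # a reference "typical legitimate" URL (assumed values)

contributions = shapley_values(toy_score, instance, baseline)
for name, phi in zip(FEATURE_NAMES, contributions):
    print(f"{name}: {phi:+.2f}")
```

For a linear score like this one, each contribution reduces to the weight times the feature's difference from the baseline, and the contributions sum to the gap between the instance's score and the baseline's score. Exact enumeration grows factorially with the number of features, so it is practical only for very small examples.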