Addressing Binary Classification over Class Imbalanced Clinical Datasets Using Computationally Intelligent Techniques

Bibliographic Details
Main Author: Vinod Kumar (48743) (author)
Other Authors: Gotam Singh Lalotra (17542062) (author), Ponnusamy Sasikala (17542065) (author), Dharmendra Singh Rajput (17542068) (author), Rajesh Kaluri (17541486) (author), Kuruva Lakshmanna (17542071) (author), Mohammad Shorfuzzaman (17542050) (author), Abdulmajeed Alsufyani (276154) (author), Mueen Uddin (4903510) (author)
Published: 2022
Subjects:
_version_ 1864513531095810048
author Vinod Kumar (48743)
author2 Gotam Singh Lalotra (17542062)
Ponnusamy Sasikala (17542065)
Dharmendra Singh Rajput (17542068)
Rajesh Kaluri (17541486)
Kuruva Lakshmanna (17542071)
Mohammad Shorfuzzaman (17542050)
Abdulmajeed Alsufyani (276154)
Mueen Uddin (4903510)
author2_role author
author
author
author
author
author
author
author
author_facet Vinod Kumar (48743)
Gotam Singh Lalotra (17542062)
Ponnusamy Sasikala (17542065)
Dharmendra Singh Rajput (17542068)
Rajesh Kaluri (17541486)
Kuruva Lakshmanna (17542071)
Mohammad Shorfuzzaman (17542050)
Abdulmajeed Alsufyani (276154)
Mueen Uddin (4903510)
author_role author
dc.creator.none.fl_str_mv Vinod Kumar (48743)
Gotam Singh Lalotra (17542062)
Ponnusamy Sasikala (17542065)
Dharmendra Singh Rajput (17542068)
Rajesh Kaluri (17541486)
Kuruva Lakshmanna (17542071)
Mohammad Shorfuzzaman (17542050)
Abdulmajeed Alsufyani (276154)
Mueen Uddin (4903510)
dc.date.none.fl_str_mv 2022-07-13T03:00:00Z
dc.identifier.none.fl_str_mv 10.3390/healthcare10071293
dc.relation.none.fl_str_mv https://figshare.com/articles/journal_contribution/Addressing_Binary_Classification_over_Class_Imbalanced_Clinical_Datasets_Using_Computationally_Intelligent_Techniques/24717522
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Health sciences
Health services and systems
Information and computing sciences
Artificial intelligence
Data management and data science
Machine learning
classification
balancing techniques
clinical dataset
machine learning
dc.title.none.fl_str_mv Addressing Binary Classification over Class Imbalanced Clinical Datasets Using Computationally Intelligent Techniques
dc.type.none.fl_str_mv Text
Journal contribution
info:eu-repo/semantics/publishedVersion
text
contribution to journal
description <p dir="ltr">Healthcare is a basic need of every human being, and clinical datasets play an important role in developing intelligent healthcare systems for monitoring people's health. Most real-world datasets are inherently class imbalanced, and clinical datasets are no exception; imbalanced class distributions cause several problems when training classifiers. Consequently, classifiers suffer from low accuracy, precision, and recall, and from a high degree of misclassification. We performed a brief literature review on class-imbalanced learning. This study presents an empirical performance evaluation of six classifiers (Decision Tree, k-Nearest Neighbor, Logistic Regression, Artificial Neural Network, Support Vector Machine, and Gaussian Naïve Bayes) over five imbalanced clinical datasets (Breast Cancer Disease, Coronary Heart Disease, Indian Liver Patient, Pima Indians Diabetes Database, and Coronary Kidney Disease) with respect to seven class-balancing techniques (Undersampling, Random oversampling, SMOTE, ADASYN, SVM-SMOTE, SMOTEEN, and SMOTETOMEK). We also explore explanations for the superiority of particular classifiers and data-balancing techniques, and we discuss recommendations on how to handle class-imbalanced datasets when training supervised machine learning methods. The result analysis shows that the SMOTEEN balancing method often performed better than the other six data-balancing techniques across all six classifiers and all five clinical datasets. The other six balancing techniques performed almost equally to one another, and moderately worse than SMOTEEN.</p><h2>Other Information</h2><p dir="ltr">Published in: Healthcare<br>License: <a href="https://creativecommons.org/licenses/by/4.0/" target="_blank">https://creativecommons.org/licenses/by/4.0/</a><br>See article on publisher's website: <a href="https://dx.doi.org/10.3390/healthcare10071293" target="_blank">https://dx.doi.org/10.3390/healthcare10071293</a></p><p dir="ltr">Disclaimer: The University of Doha for Science and Technology replaced the now-former College of the North Atlantic-Qatar after an Amiri decision in 2022. UDST has become the first national applied university in Qatar; it is also the second national university in the country.</p>
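The abstract names its balancing techniques without showing how they work. As a rough illustration only, and not the authors' code, the sketch below implements three of them (random undersampling, random oversampling, and a simplified SMOTE-style interpolation) in plain Python on a toy dataset. The function names, the toy data, and the parameter `k` are assumptions made for this example. The technique the record spells SMOTEEN (SMOTEENN in the imbalanced-learn library) additionally cleans the oversampled data with edited nearest neighbours, a step that is not reproduced here.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly drop rows so every class keeps as many rows as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(idx, n_min))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, seed=0):
    """Randomly duplicate rows until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_max = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(n_max - n):
            i = rng.choice(idx)
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, k=3, seed=0):
    """Simplified SMOTE-style oversampling for a binary problem.

    Each synthetic row lies on the segment between a minority row and one of
    its k nearest minority neighbours. Needs at least two minority rows.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    n_needed = max(counts.values()) - counts[minority]
    pts = [X[i] for i, lab in enumerate(y) if lab == minority]
    X_out, y_out = list(X), list(y)
    for _ in range(n_needed):
        base = rng.choice(pts)
        # Rank the other minority rows by squared Euclidean distance to base.
        neighbours = sorted(
            (p for p in pts if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        X_out.append([a + gap * (b - a) for a, b in zip(base, nb)])
        y_out.append(minority)
    return X_out, y_out


# Toy imbalanced data (hypothetical): 9 majority rows, 3 minority rows.
X = [[float(i), float(i % 3)] for i in range(12)]
y = [0] * 9 + [1] * 3

X_under, y_under = random_undersample(X, y)
X_over, y_over = random_oversample(X, y)
X_smote, y_smote = smote_like(X, y, k=2)
```

In a real evaluation the resampling would be applied to the training split only, so that the test data keeps its original class distribution.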
eu_rights_str_mv openAccess
id Manara2_9306360a964189071276cd27a035c049
identifier_str_mv 10.3390/healthcare10071293
network_acronym_str Manara2
network_name_str Manara2
oai_identifier_str oai:figshare.com:article/24717522
publishDate 2022
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle Addressing Binary Classification over Class Imbalanced Clinical Datasets Using Computationally Intelligent Techniques
Vinod Kumar (48743)
Health sciences
Health services and systems
Information and computing sciences
Artificial intelligence
Data management and data science
Machine learning
classification
balancing techniques
clinical dataset
machine learning
status_str publishedVersion
title Addressing Binary Classification over Class Imbalanced Clinical Datasets Using Computationally Intelligent Techniques
topic Health sciences
Health services and systems
Information and computing sciences
Artificial intelligence
Data management and data science
Machine learning
classification
balancing techniques
clinical dataset
machine learning