The Search process of the genetic algorithm.

Bibliographic details
Main author: Wenguang Li (6528113) (author)
Other authors: Yan Peng (104995) (author), Ke Peng (2220973) (author)
Published: 2024
_version_ 1852026283514396672
author Wenguang Li (6528113)
author2 Yan Peng (104995)
Ke Peng (2220973)
author2_role author
author
author_facet Wenguang Li (6528113)
Yan Peng (104995)
Ke Peng (2220973)
author_role author
dc.creator.none.fl_str_mv Wenguang Li (6528113)
Yan Peng (104995)
Ke Peng (2220973)
dc.date.none.fl_str_mv 2024-09-30T17:32:02Z
dc.identifier.none.fl_str_mv 10.1371/journal.pone.0311222.g009
dc.relation.none.fl_str_mv https://figshare.com/articles/figure/The_Search_process_of_the_genetic_algorithm_/27137111
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Science Policy
Plant Biology
Biological Sciences not elsewhere classified
Mathematical Sciences not elsewhere classified
Information Systems not elsewhere classified
body mass index
stacking model based
performing lightgbm model
layer stacking model
data balance processing
random forest model
model integration strategies
diabetes
xgboost model optimized
model integration
xgboost model
random oversampling
data imbalance
significant impact
scientific basis
results show
research object
reaching effects
publicly available
powerful tool
particularly crucial
new idea
kaggle platform
genetic algorithm
early intervention
early diagnosis
also provides
also provided
dc.title.none.fl_str_mv The Search process of the genetic algorithm.
dc.type.none.fl_str_mv Image
Figure
info:eu-repo/semantics/publishedVersion
image
description Diabetes, as an incurable lifelong chronic disease, has profound and far-reaching effects on patients. Given this, early intervention is particularly crucial, as it can not only significantly improve the prognosis of patients but also provide valuable reference information for clinical treatment. This study selected the BRFSS (Behavioral Risk Factor Surveillance System) dataset, which is publicly available on the Kaggle platform, as the research object, aiming to provide a scientific basis for the early diagnosis and treatment of diabetes through advanced machine learning techniques. Firstly, the dataset was balanced using various sampling methods; secondly, a Stacking model based on GA-XGBoost (XGBoost model optimized by genetic algorithm) was constructed for the risk prediction of diabetes; finally, the interpretability of the model was deeply analyzed using Shapley values. The results show: (1) Random oversampling, ADASYN, SMOTE, and SMOTEENN were used for data balance processing, among which SMOTEENN showed better efficiency and effect in dealing with data imbalance. (2) The GA-XGBoost model optimized the hyperparameters of the XGBoost model through a genetic algorithm to improve the model’s predictive accuracy. Combined with the better-performing LightGBM model and random forest model, a two-layer Stacking model was constructed. This model not only outperforms single machine learning models in predictive effect but also provides a new idea and method in the field of model integration. (3) Shapley value analysis identified features that have a significant impact on the prediction of diabetes, such as age and body mass index. This analysis not only enhances the transparency of the model but also provides more precise treatment decision support for doctors and patients.
In summary, this study has not only improved the accuracy of predicting the risk of diabetes by adopting advanced machine learning techniques and model integration strategies but also provided a powerful tool for the early diagnosis and personalized treatment of diabetes.
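As a rough illustration of the kind of genetic-algorithm search named in the title and description, the sketch below evolves a small population of hyperparameter settings using tournament selection, uniform crossover, mutation, and elitism. The parameter names and ranges, the population size, the number of generations, and the `toy_fitness` function are all assumptions made for illustration; the record does not state the authors' configuration. In a real setting the fitness would presumably be a cross-validated score of an XGBoost model rather than this stand-in.

```python
import random

# Hypothetical search space: names and ranges are illustrative assumptions,
# not the settings used in the study.
SPACE = {
    "learning_rate": (0.01, 0.3),
    "max_depth": (3, 10),
    "subsample": (0.5, 1.0),
}
INT_PARAMS = {"max_depth"}


def sample_gene(name, rng):
    """Draw one value for a parameter uniformly from its range."""
    lo, hi = SPACE[name]
    return rng.randint(lo, hi) if name in INT_PARAMS else rng.uniform(lo, hi)


def random_individual(rng):
    return {name: sample_gene(name, rng) for name in SPACE}


def toy_fitness(ind):
    """Stand-in for a cross-validated model score (higher is better)."""
    return -(
        (ind["learning_rate"] - 0.1) ** 2
        + ((ind["max_depth"] - 6) / 10) ** 2
        + (ind["subsample"] - 0.8) ** 2
    )


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent with equal odds."""
    return {name: (a[name] if rng.random() < 0.5 else b[name]) for name in SPACE}


def mutate(ind, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {
        name: (sample_gene(name, rng) if rng.random() < rate else value)
        for name, value in ind.items()
    }


def tournament(pop, fitness, rng, k=3):
    """Pick the fittest of k randomly chosen individuals."""
    return max(rng.sample(pop, k), key=fitness)


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        best = max(pop, key=fitness)
        history.append(fitness(best))
        next_pop = [best]  # elitism: the current best always survives
        while len(next_pop) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            next_pop.append(mutate(crossover(p1, p2, rng), rng))
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = genetic_search(toy_fitness)
    print(best)
    print(history[0], history[-1])
```

Because the best individual is carried into every new generation, the `history` list never decreases. Plotting it against the generation index is one common way to visualize a search process, though the record does not describe what the figure itself plots.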
eu_rights_str_mv openAccess
id Manara_74b0262e2e39fb6b809df36f6c2bcea0
identifier_str_mv 10.1371/journal.pone.0311222.g009
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/27137111
publishDate 2024
rights_invalid_str_mv CC BY 4.0
status_str publishedVersion
title The Search process of the genetic algorithm.