Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip

Accurate prediction of water inrush volumes is essential for safeguarding tunnel construction operations. This study proposes a method for predicting tunnel water inrush volumes, leveraging the eXtreme Gradient Boosting (XGBoost) model optimized with Bayesian techniques. To maximize the uti...

Bibliographic Details
Main Author: Shengdong Ju (21178127) (author)
Other Authors: Guangzhao Ou (21178130) (author), Tao Peng (81319) (author), Yanning Wang (5143841) (author), Quanlin Song (21178133) (author), Peng Guan (102746) (author)
Published: 2025
Subjects:
_version_ 1852021045812264960
author Shengdong Ju (21178127)
author2 Guangzhao Ou (21178130)
Tao Peng (81319)
Yanning Wang (5143841)
Quanlin Song (21178133)
Peng Guan (102746)
author2_role author
author
author
author
author
author_facet Shengdong Ju (21178127)
Guangzhao Ou (21178130)
Tao Peng (81319)
Yanning Wang (5143841)
Quanlin Song (21178133)
Peng Guan (102746)
author_role author
dc.creator.none.fl_str_mv Shengdong Ju (21178127)
Guangzhao Ou (21178130)
Tao Peng (81319)
Yanning Wang (5143841)
Quanlin Song (21178133)
Peng Guan (102746)
dc.date.none.fl_str_mv 2025-04-25T05:21:38Z
dc.identifier.none.fl_str_mv 10.3389/feart.2025.1590203.s001
dc.relation.none.fl_str_mv https://figshare.com/articles/dataset/Data_Sheet_1_Tunnel_water_inflow_prediction_using_explainable_machine_learning_and_augmented_partially_missing_dataset_zip/28863425
dc.rights.none.fl_str_mv CC BY 4.0
info:eu-repo/semantics/openAccess
dc.subject.none.fl_str_mv Solid Earth Sciences
tunnel water inflow
XGBoost
bayesian optimization
data augmentation
model interpretation
dc.title.none.fl_str_mv Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
dc.type.none.fl_str_mv Dataset
info:eu-repo/semantics/publishedVersion
dataset
description Accurate prediction of water inrush volumes is essential for safeguarding tunnel construction operations. This study proposes a method for predicting tunnel water inrush volumes using the eXtreme Gradient Boosting (XGBoost) model optimized with Bayesian techniques. To maximize the utility of available data, 654 datasets with missing values were imputed and augmented, forming a robust dataset for training and validating the Bayesian-optimized XGBoost (BO-XGBoost) model. Furthermore, the SHapley Additive exPlanations (SHAP) method was employed to elucidate the contribution of each input feature to the predictions. The results indicate that: (1) the BO-XGBoost model achieved high predictive accuracy on the test set, with a root mean square error (RMSE) of 7.5603, a mean absolute error (MAE) of 3.2940, a mean absolute percentage error (MAPE) of 4.51%, and a coefficient of determination (R²) of 0.9755; (2) compared with support vector regression (SVR), decision tree (DT), and random forest (RF) models, the BO-XGBoost model achieved the highest R² and the smallest prediction errors; (3) the feature importance ranking yielded by SHAP is groundwater level (h) > water-producing characteristics (W) > tunnel burial depth (H) > rock mass quality index (RQD). The proposed BO-XGBoost model can therefore help managers make informed decisions to mitigate water inrush risks and support the safe and efficient advancement of tunnel projects.
eu_rights_str_mv openAccess
id Manara_a70dbd2b7c19428058505c6e6e366266
identifier_str_mv 10.3389/feart.2025.1590203.s001
network_acronym_str Manara
network_name_str ManaraRepo
oai_identifier_str oai:figshare.com:article/28863425
publishDate 2025
repository.mail.fl_str_mv
repository.name.fl_str_mv
repository_id_str
rights_invalid_str_mv CC BY 4.0
spellingShingle Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
Shengdong Ju (21178127)
Solid Earth Sciences
tunnel water inflow
XGBoost
bayesian optimization
data augmentation
model interpretation
status_str publishedVersion
title Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
title_full Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
title_fullStr Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
title_full_unstemmed Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
title_short Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
title_sort Data Sheet 1_Tunnel water inflow prediction using explainable machine learning and augmented partially missing dataset.zip
topic Solid Earth Sciences
tunnel water inflow
XGBoost
bayesian optimization
data augmentation
model interpretation
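
The description above reports four test-set metrics (RMSE, MAE, MAPE, and R²). As a rough illustration of how such metrics are conventionally computed, here is a minimal sketch in plain Python. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the dataset or from the authors' code; the formulas are the standard textbook definitions, which may differ in detail from what the study used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent), and R^2 for two equal-length sequences.

    Assumes no true value is zero (needed for MAPE) and that the true
    values are not all identical (needed for R^2).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Sum of squared residuals, shared by RMSE and R^2.
    ss_res = sum(e * e for e in errors)

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    # Total sum of squares around the mean of the true values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2}


if __name__ == "__main__":
    # Hypothetical inflow values, for illustration only.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

RMSE and MAE are expressed in the units of the target variable, while MAPE and R² are unitless, which is why the record can report MAPE as a percentage without stating the inflow units.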