Table 1_An interpretable machine learning model for early prediction of Escherichia coli infection in ICU patients.docx

Bibliographic Details
Main Author: Shu Yang (381226) (author)
Other Authors: Laiyu Zou (22672274) (author), Huixin Liang (5344523) (author), Xiaohong Xu (180651) (author), Xiaoling Chen (679181) (author)
Published: 2025
Description
Summary:

Background: Early and accurate identification of Escherichia coli (E. coli) infection in intensive care unit (ICU) patients remains challenging, but it may improve clinical outcomes. This study aimed to develop and validate an interpretable machine learning model for early prediction of E. coli infection at ICU admission.

Methods: This retrospective study used the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. Adult patients (aged 18–100 years) with a first ICU admission and a length of stay ≥24 hours were included. E. coli infection was identified from microbiological results and diagnostic codes. Missing data were imputed using the missForest algorithm. Feature selection was performed with Boruta and the least absolute shrinkage and selection operator (LASSO), and the variables selected by both methods were used for model construction. Eight machine learning models were developed: logistic regression, k-nearest neighbors, decision tree, random forest, extreme gradient boosting, light gradient boosting machine, support vector machine (SVM), and neural network. Model performance in the validation cohort was assessed using the area under the receiver operating characteristic curve (AUC) with 95% confidence interval (CI), sensitivity, specificity, F1 score, calibration curves, decision curve analysis (DCA), and clinical impact curves (CIC). Model interpretability was evaluated with Shapley additive explanations (SHAP).

Results: A total of 52,554 ICU patients were analyzed, of whom 4,157 (7.9%) had E. coli infection. Twenty-eight intersecting variables were selected for modeling. Among all models, the SVM achieved the highest discrimination (AUC = 0.745, 95% CI: 0.726–0.764), followed by random forest (AUC = 0.742) and extreme gradient boosting (AUC = 0.739). Calibration and decision curve analyses indicated robust model calibration and clinical utility. SHAP analysis identified gender, age, sepsis, sedative use, and potassium level as the most influential predictors. A web-based tool was developed to enable real-time clinical risk estimation and individualized interpretability.

Conclusions: An interpretable SVM-based machine learning model was developed and validated for early prediction of E. coli infection in ICU patients, demonstrating good discrimination, calibration, and potential clinical benefit. The associated online tool provides transparent, individualized risk predictions and may facilitate timely clinical decision-making.
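
For readers unfamiliar with the headline metric, the sketch below shows one common way to obtain an AUC together with a 95% CI. It is illustrative only: the record does not say how the authors computed their interval, so the percentile bootstrap used here is an assumption, and the function names (auc_score, bootstrap_auc_ci) and the toy data are hypothetical rather than taken from the study.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as 0.5), computed from ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores and give each the average 1-based rank.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the authors')."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy data: 1 = infected, 0 = not infected; scores are predicted risks.
    y = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
    p = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.30, 0.55, 0.60, 0.15]
    print("AUC:", auc_score(y, p))
    print("95% CI:", bootstrap_auc_ci(y, p))
```

With a cohort of the size reported above, a bootstrap interval would be far narrower than on this toy sample; other approaches, such as DeLong's method, are also widely used for AUC intervals.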