Image 2_Development and application of machine learning models for hematological disease diagnosis using routine laboratory parameters: a user-friendly diagnostic platform.jpeg

Bibliographic Details
Main Author: Jingya Liu (13338460) (author)
Other Authors: Yang Gou (22346113) (author), Wuchen Yang (11606577) (author), Hao Wang (39217) (author), Jing Zhang (23775) (author), Shengwang Wu (22346116) (author), Siheng Liu (22346119) (author), Tinglu Tao (22346122) (author), Yongjie Tang (11828551) (author), Cheng Yang (273624) (author), Siyin Chen (11638942) (author), Ping Wang (42415) (author), Yimei Feng (6383840) (author), Cheng Zhang (70708) (author), Shuiqing Liu (2826422) (author), Xiangui Peng (9138497) (author), Xi Zhang (83736) (author)
Published: 2025
Description
Summary:

Aim: In recent years, as the social environment has changed, the incidence and detection rate of hematological diseases have increased. Early detection and diagnosis of hematological diseases are important for improving patients' quality of life and prognosis.

Methods: In this study, we employed 54 clinical and conventional laboratory parameters. By optimally combining multiple feature selection methods and machine learning algorithms, we developed 7 machine learning models with varying feature set sizes. We comprehensively evaluated the performance of these models, analyzed the interpretability of the optimal and simplified models using SHapley Additive exPlanations (SHAP), and compared the diagnostic performance of these two models with that of hematologists. Finally, we developed a user-friendly diagnostic platform.

Results: The ensemble model_1 with 46 feature parameters (EnMod1-46) and the simple ensemble model_2 with 12 feature parameters (EnMod2-12) demonstrated strong performance in diagnosing 16 types of hematological diseases. On the temporally distinct test set_1, EnMod1-46 achieved an accuracy of 0.804 and an area under the curve (AUC) of 0.964, while EnMod2-12 attained an accuracy of 0.784 and an AUC of 0.961. On the independent external test set_2, used to further validate generalization performance, EnMod1-46 achieved an accuracy of 0.738 and an AUC of 0.973, while EnMod2-12 yielded an accuracy of 0.705 and an AUC of 0.962. SHAP analysis showed that PLT, WBC, MCV, HGB, RBC and age were important parameters in both models. Comparative analysis against clinical diagnosis revealed that both EnMod1-46 and EnMod2-12 outperformed junior hematologists, while EnMod1-46 was comparable to senior hematologists. Based on EnMod2-12, we also developed a user-friendly diagnostic platform to facilitate risk assessment and improve access to accurate diagnosis.

Conclusion: This study provides an efficient and accurate screening method for hematological diseases, particularly for resource-limited countries and regions.
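
Note on the reported metrics: the summary reports both accuracy and AUC for a 16-class diagnosis task, but this record does not state how the AUC was averaged across classes. The sketch below is only an illustration under the assumption of a macro-averaged one-vs-rest AUC; the class names, probabilities, and function names are hypothetical toy values and are not taken from the study's data or code.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of cases whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[str],
                  proba: Sequence[Sequence[float]],
                  classes: Sequence[str]) -> float:
    """Assumed averaging scheme: one-vs-rest AUC per class, then the
    unweighted mean over classes."""
    aucs: List[float] = []
    for k, c in enumerate(classes):
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[k] for row in proba]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: 3 placeholder classes instead of the study's 16.
classes = ["A", "B", "C"]
y_true = ["A", "A", "B", "B", "C", "C"]
proba = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],  # true class A, but class B has the highest probability
    [0.2, 0.6, 0.2],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]
# Predicted class = class with the highest predicted probability.
y_pred = [classes[max(range(len(classes)), key=row.__getitem__)] for row in proba]

acc = accuracy(y_true, y_pred)
auc = macro_ovr_auc(y_true, proba, classes)
print(f"accuracy={acc:.3f}, macro OvR AUC={auc:.3f}")
```

In this toy example one case is misclassified, yet every class still ranks its own cases above all others, so the AUC is higher than the accuracy. Accuracy depends only on the top-ranked class per case, while AUC reflects how well the predicted probabilities rank cases, which is one reason the two metrics can differ.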