Research Article: Development and application of machine learning models for hematological disease diagnosis using routine laboratory parameters: a user-friendly diagnostic platform
Abstract:
In recent years, as social environments have changed, the incidence and detection rates of hematological diseases have risen. Early detection and diagnosis of hematological diseases are essential for improving patients' quality of life and prognosis.
In this study, we used 54 clinical and routine laboratory parameters. By combining multiple feature selection methods with machine learning algorithms, we developed 7 machine learning models with feature sets of varying size. We comprehensively evaluated these models, analyzed the interpretability of the optimal and simplified models using SHapley Additive exPlanations (SHAP), and compared the diagnostic performance of these two models with that of hematologists. Finally, we developed a user-friendly diagnostic platform.
The ensemble model_1 with 46 feature parameters (EnMod1-46) and the simple ensemble model_2 with 12 feature parameters (EnMod2-12) performed strongly in diagnosing 16 types of hematological diseases. On the temporally distinct test set_1, EnMod1-46 achieved an accuracy of 0.804 and an area under the curve (AUC) of 0.964, while EnMod2-12 attained an accuracy of 0.784 and an AUC of 0.961. In a further test of generalization on the independent external test set_2, EnMod1-46 achieved an accuracy of 0.738 and an AUC of 0.973, while EnMod2-12 yielded an accuracy of 0.705 and an AUC of 0.962. SHAP analysis showed that platelet count (PLT), white blood cell count (WBC), mean corpuscular volume (MCV), hemoglobin (HGB), red blood cell count (RBC) and age were important parameters in both models. Comparison with clinical diagnosis showed that both EnMod1-46 and EnMod2-12 outperformed junior hematologists, while EnMod1-46 was comparable to senior hematologists. Based on EnMod2-12, we also developed a user-friendly diagnostic platform to facilitate risk assessment and improve access to accurate diagnosis.
This study provides an efficient and accurate screening method for hematological diseases, which may be especially useful in resource-limited countries and regions.
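The accuracy and AUC figures reported above are multi-class metrics, and the two can diverge: a model may rank the correct disease highly for most patients (high AUC) while its single top prediction is wrong more often (lower accuracy). The following is a minimal sketch of how such metrics are commonly computed, assuming a one-vs-rest, macro-averaged AUC. The abstract does not state which averaging scheme was used, and the function names and toy data here are hypothetical illustrations, not the authors' code or results.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[Sequence[float]]) -> float:
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = 0
    for label, probs in zip(y_true, y_prob):
        # Index of the largest probability (first index wins on ties).
        predicted = max(range(len(probs)), key=probs.__getitem__)
        correct += predicted == label
    return correct / len(y_true)


def binary_auc(is_positive: Sequence[bool], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[int], y_prob: Sequence[Sequence[float]]) -> float:
    """One-vs-rest AUC per class, averaged with equal class weights (an assumption)."""
    n_classes = len(y_prob[0])
    aucs: List[float] = []
    for k in range(n_classes):
        is_pos = [label == k for label in y_true]
        scores = [probs[k] for probs in y_prob]
        aucs.append(binary_auc(is_pos, scores))
    return sum(aucs) / n_classes


# Toy data (hypothetical): 6 samples, 3 classes, rows are predicted probabilities.
y_true = [0, 0, 1, 1, 2, 2]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],  # top prediction is class 1, true class is 0
    [0.2, 0.6, 0.2],
    [0.1, 0.8, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_prob))
    print("macro one-vs-rest AUC:", macro_ovr_auc(y_true, y_prob))
```

On this toy data one of six top predictions is wrong, yet every class still ranks its own samples above all others, so accuracy is 5/6 while the macro AUC is 1.0. This is the same direction of gap seen between the reported accuracy and AUC values.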