Research Article: Hypoxemia prediction model based on XGBoost during sedation for gastrointestinal endoscopy
Abstract:
Hypoxemia is the most common complication of sedated gastrointestinal endoscopy and can lead to serious consequences. Predicting and preventing hypoxemia remains challenging. Accurate prediction using integrated clinical data and artificial intelligence shows great potential. This study aimed to develop a robust, interpretable, and generalizable Machine Learning (ML) model with acceptable performance for predicting hypoxemia during sedated gastrointestinal endoscopy.
This prospective study included 647 adult patients who underwent sedated gastrointestinal endoscopy at Shanghai Sixth People's Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, between January and May 2025. For feature selection, we combined statistical and ML techniques: Pearson correlation analysis, the t-test, the chi-square test, Levene's test, SHapley Additive exPlanations (SHAP) values, and eXtreme Gradient Boosting (XGBoost) feature importance metrics. Prediction models were developed using the XGBoost algorithm, and their performance was evaluated using Accuracy, Precision, Recall, F1-score, and Receiver Operating Characteristic Area Under the Curve (ROC–AUC). After identifying the optimal model, a hypoxemia prediction model was established and validated. We also analyzed the performance of combined features in order to create new ("innovative") features.
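As an illustration of the evaluation metrics named above, the following is a minimal sketch that computes Accuracy, Precision, Recall, F1-score, and ROC–AUC for a binary hypoxemia label using only the Python standard library. The toy labels, the scores, and the 0.5 decision threshold are hypothetical and are not taken from the study; the sketch uses the standard binary definitions, and the averaging scheme the authors applied (for example, weighted averaging across classes) is not stated in the abstract.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = hypoxemia."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for the positive class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 patients without hypoxemia, 2 with hypoxemia.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.10, 0.20, 0.15, 0.30, 0.05, 0.40, 0.60, 0.25, 0.70, 0.35]

# Assumed decision threshold of 0.5 on the predicted probability.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

On imbalanced data such as this toy set, Accuracy is dominated by the majority class, while ROC–AUC measures how well the scores rank positive cases above negative ones regardless of threshold. This is one reason the two figures can differ noticeably for the same model.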
The XGBoost model demonstrated the best performance, achieving an accuracy, recall, and F1-score of 0.91 and an ROC–AUC of 0.74 using the selected features. Feature importance analysis identified 29 key features, comprising 26 traditional features and three innovative features introduced in this study; Body Mass Index (BMI), waist circumference, neck circumference, age, and baseline SpO2 contributed most. Model performance improved when the model was applied to a more balanced dataset of 647 samples, underscoring the importance of sample size for model accuracy.
We present a robust XGBoost-based hypoxemia prediction model that can help clinicians identify at-risk patients during sedated gastrointestinal endoscopy. The model's performance highlights the potential of artificial intelligence to enhance patient safety and clinical decision-making. Future studies should focus on refining the model using larger and more diverse datasets to improve predictive accuracy and clinical applicability. Additionally, methods such as latent-space analysis will be explored to address class imbalance.