Research Article: Development and validation of an interpretable machine learning model for acute radiation dermatitis in breast cancer
Abstract:
Radiation dermatitis (RD), a common adverse reaction in breast cancer radiotherapy, impairs quality of life and increases healthcare burdens. Developing an effective risk prediction model is crucial for early high-risk patient identification and preventive interventions.
This study enrolled 691 breast cancer patients who underwent postoperative radiotherapy at our center from February 1 to December 19, 2024. RD severity and associated factors were monitored during radiotherapy and for 2 weeks afterward. The dataset was divided into training (n=552) and test (n=139) cohorts. Fourteen machine learning algorithms were evaluated via 10-fold cross-validation, with model selection based on the area under the receiver operating characteristic curve (AUC) and other metrics. Model reliability was verified on the internal hold-out test set, and SHAP analysis was used to interpret the model's predictions.
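The cohort sizes, fold bookkeeping, and AUC metric described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper names `auc` and `k_fold_indices`, the shuffling seed, and the toy labels and scores are assumptions made for illustration, and the study's actual splitting procedure and software are not stated in the abstract.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and deal it into k nearly equal validation folds.
    The seed is a hypothetical choice, not taken from the study."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

# Cohort sizes reported in the abstract: 691 total, 139 held out for testing.
n_total, n_test = 691, 139
n_train = n_total - n_test

# Ten validation folds over the training cohort; each fold is held out once
# while a candidate model is fitted on the remaining nine.
folds = k_fold_indices(n_train, k=10)
fold_sizes = [len(f) for f in folds]

# Toy example of the metric on made-up labels and scores.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(n_train, fold_sizes, toy_auc)
```

In a full workflow each candidate algorithm would be fitted on nine folds and scored with `auc` on the tenth, and the mean across folds would drive model selection.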
Among 691 patients, 52.68% (n=364) developed grade ≥2 acute RD. The random forest model performed best, achieving an AUC of 0.84 (95% CI: 0.807–0.873) in training and 0.748 (95% CI: 0.665–0.831) in testing. Sensitivity and specificity were 0.811 and 0.747 in training and 0.877 and 0.576 in testing, respectively. Calibration curves confirmed agreement between predicted and observed risk. Decision curve analysis indicated net benefits 0.2–0.4 higher than the “treat-all” or “treat-none” strategies at treatment thresholds of 25%–75%. Shapley Additive exPlanations (SHAP) analysis identified Clinical Target Volume-Supraclavicular (CTVsc), Clinical Target Volume-Internal Mammary (CTVim), TNM stage II, and diabetic status as key predictors.
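To make the decision curve comparison concrete, the sketch below computes net benefit with the standard formula, NB = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. It evaluates only the “treat-all” reference strategy, using the event count reported above (364 of 691). The function names are hypothetical, and the model's own net benefit is not reproduced because the per-threshold true and false positive counts are not given in the abstract.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit at threshold probability pt:
    TP/n - FP/n * pt / (1 - pt)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    return tp / n - fp / n * threshold / (1 - threshold)

def treat_all_net_benefit(n_events, n, threshold):
    """Treat-all strategy: every event is a true positive and
    every non-event is a false positive."""
    return net_benefit(n_events, n - n_events, n, threshold)

def treat_none_net_benefit():
    """Treat-none strategy: no true or false positives, so net benefit is 0."""
    return 0.0

# Event count from the abstract: 364 of 691 patients developed the RD outcome.
n_events, n_patients = 364, 691
reference = {
    pt: treat_all_net_benefit(n_events, n_patients, pt)
    for pt in (0.25, 0.50, 0.75)
}
for pt, nb in reference.items():
    print(f"threshold={pt:.2f}  treat-all net benefit={nb:.3f}")
```

A model is useful at a given threshold when its net benefit exceeds both reference values; at higher thresholds the treat-all value falls below zero, so treat-none becomes the relevant comparator.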
This explainable machine learning model demonstrates robust discriminative power and clinical utility. Interpretability analysis revealed feature nonlinearities, providing a theoretical basis for personalized radiotherapy planning to reduce severe RD risk.