Abstract In recent years, the incidence of refractory Mycoplasma pneumoniae pneumonia (RMPP) has risen significantly. RMPP can cause severe pulmonary and extrapulmonary complications, and its early identification remains a challenge for clinicians. In this retrospective single-center study, we included patients diagnosed with Mycoplasma pneumoniae pneumonia in 2021 and categorized them into RMPP and non-RMPP groups. Univariate regression analysis was first used to identify variables associated with RMPP. Seven mainstream machine learning methods were then used to construct predictive models, whose reliability and robustness were evaluated through tenfold cross-validation and sensitivity analysis. The optimal predictive model was selected using multidimensional metric assessment, and SHAP analysis was used to identify the key predictive factors for RMPP. Twenty-nine factors from various dimensions were found to be associated with RMPP and were used to build the predictive models. The XGBoost model demonstrated high predictive capability, with an accuracy of 0.80 and an AUC of 0.93. Tenfold cross-validation and sensitivity analysis confirmed the model’s robustness and reliability. SHAP analysis interpreted the final model with eight key features: fever duration, macrolide treatment before hospitalization, severe Mycoplasma pneumoniae pneumonia, lactate dehydrogenase, neutrophil-to-lymphocyte ratio, alanine aminotransferase, peak fever, and extensive lung consolidation. This simple, effective predictive model enhances clinicians’ understanding and aids early identification of RMPP.
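For readers unfamiliar with the evaluation terms used above, the following is a minimal, dependency-free sketch of how accuracy and AUC can be computed from out-of-fold predictions under tenfold cross-validation. It is an illustration only, not the study's code: the function names (`kfold_indices`, `cross_validate`, `accuracy`, `auc`), the 0.5 decision threshold, the unstratified shuffling, and the pooling of out-of-fold scores are all assumptions made here, and any classifier can be supplied through the hypothetical `fit` and `predict` callables.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return one out-of-fold score per sample.

    fit(X_train, y_train) -> model and predict(model, X_test) -> scores
    are caller-supplied (hypothetical) callables.
    """
    oof = [None] * len(y)
    for held_out in kfold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores = predict(model, [X[i] for i in held_out])
        for i, s in zip(held_out, scores):
            oof[i] = s
    return oof


def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of samples classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def auc(y_true, y_score):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Pooling the out-of-fold scores before computing AUC avoids undefined values when a small fold happens to contain only one class; averaging per-fold metrics over stratified folds is an equally common alternative.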