Fan Yao,1,2,* Jianliang Miao,3,* Bing Quan,1,2 Jinghuan Li,1,2 Bei Tang,1,2 Shenxin Lu,1,2 Xin Yin1,2

1Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, People’s Republic of China
2National Clinical Research Center for Interventional Medicine, Shanghai, People’s Republic of China
3First Affiliated Hospital of Dalian Medical University, Dalian Medical University, Dalian, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Xin Yin, Liver Cancer Institute, Zhongshan Hospital, Fudan University, 136 Yi Xue Yuan Road, Shanghai, People’s Republic of China, Email yin.xin@zs-hospital.sh.cn

Purpose: To establish prediction models using Shapley Additive exPlanations (SHAP) and multiple machine learning (ML) algorithms to identify clinical features influencing hepatic arterial infusion chemotherapy (HAIC) resistance and survival in patients with hepatocellular carcinoma (HCC).

Patients and Methods: We recruited 286 patients with unresectable HCC who underwent HAIC. Patients were divided into training and validation datasets (7:3 ratio). eXtreme Gradient Boosting (XGBoost) was used to build the preliminary resistance prediction model. The SHAP values explained the importance of the clinical features. Recursive Feature Elimination with Cross-Validation (RFECV) was used to select the optimum number of features. Seven ML methods were used to construct further resistance prediction models, and ten ML algorithms were employed to establish the survival prognosis models.

Results: The areas under the curve (AUC) of the XGBoost model were 1.000 and 0.812 for the training and validation groups, respectively. SHAP identified 27 of the 38 clinical features affecting resistance, with pre-HAIC treatment being the main factor. RFECV showed the best model performance with six features (pre-HAIC treatment, tumor size, HBV DNA, alkaline phosphatase (AKP), prothrombin time (PT), and portal vein tumor thrombosis (PVTT)).
Random Forest had the best performance among the seven ML algorithms (AUC=0.935 for training, AUC=0.876 for validation). The combination of Stepcox [forward] and Gradient Boosting Machine was the best for predicting survival (AUC=0.98 in training, AUC=0.83 in validation). Using a risk score derived from the above clinical characteristics, patients were categorized into high-risk and low-risk groups at the median score; these characteristics also performed well in the prognostic model for predicting survival in patients with HCC.

Conclusion: Pre-HAIC treatment, tumor size, HBV DNA, AKP, PT, and PVTT are effective predictors of post-HAIC resistance and survival in patients with unresectable advanced HCC.

Keywords: interpretable AI, treatment response prediction, prognostic modeling, hepatocellular carcinoma outcomes
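
The evaluation steps named in the abstract (a 7:3 training/validation split, AUC, and median risk-score stratification) can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names, the random seed, the rounding rule for the split, and the choice to label scores strictly above the median as high-risk are all assumptions made for illustration, and the toy inputs are not study data.

```python
import random
from statistics import median


def split_train_validation(items, train_fraction=0.7, seed=0):
    """Shuffle and split records into training and validation subsets.

    Assumption: a simple random split; the abstract only states a 7:3 ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratify_by_median(risk_scores):
    """Label each score 'high' or 'low' relative to the cohort median.

    Assumption: scores equal to the median fall in the low-risk group.
    """
    cutoff = median(risk_scores)
    return ["high" if s > cutoff else "low" for s in risk_scores]


if __name__ == "__main__":
    # 286 placeholder patient indices; the split sizes follow from the rounding rule above.
    train, validation = split_train_validation(range(286))
    print(len(train), len(validation))

    # Toy labels (1 = resistant) and predicted scores, purely illustrative.
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
    print(stratify_by_median([0.1, 0.4, 0.6, 0.9]))
```

Under this reading, a training AUC of 1.000 means every resistant case was scored above every non-resistant case in the training set, so the validation AUC is the more informative number.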