CHINESE JOURNAL OF PARASITOLOGY AND PARASITIC DISEASES, 2024, Vol. 42, Issue (5): 582-593. doi: 10.12140/j.issn.1000-7423.2024.05.004

• ORIGINAL ARTICLES •

Identification of lesion activity in hepatic cystic echinococcosis using a machine learning model based on radiomics and clinical features

WANG Zhanjin1, CHEN Zhiheng1, LI Fuyuan1, CAI Junjie1, XUE Zhangtuo1, ZHOU Ying2, CAO Yuntai3, WANG Zhan4,*

    1 Clinical Medical School, Qinghai University, Xining 810000, Qinghai, China
    2 Department of Hepatobiliary and Pancreatic Surgery, Qinghai University Affiliated Hospital, Xining 810000, Qinghai, China
    3 Imaging Center, Qinghai University Affiliated Hospital, Xining 810000, Qinghai, China
    4 Department of Medical Engineering and Translational Applications, Qinghai University Affiliated Hospital, Xining 810000, Qinghai, China
  • Received: 2024-05-16 Revised: 2024-09-04 Online: 2024-10-30 Published: 2024-10-24
  • Contact: * E-mail: ufofu01@163.com
  • Supported by:
    National Natural Science Foundation of China (82160131); Qinghai Provincial Department of Science and Technology (2021-ZJ-963Q)

Abstract:

Objective To develop machine learning models that use radiomic and clinical features to identify the biological activity of hepatic cystic echinococcosis (HCE) lesions. Methods CT images and clinical data were collected from 521 HCE patients treated at the Hepatobiliary and Pancreatic Surgery Department of Qinghai University Affiliated Hospital and from 236 HCE patients treated at the General Surgery Departments of Guoluo Prefectural People's Hospital and Yushu Prefectural People's Hospital from 2018 to 2022. Radiomics features were extracted and screened. Univariate and multivariate logistic regression analyses were performed on the clinical data to select features for model construction. Seven machine learning algorithms were used to construct the radiomics and clinical models: Logistic Regression (LR), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Extra Trees. A combined clinical-radiomics model was constructed by integrating the predictions of the radiomics model and the clinical model using soft voting. DeLong's test was used to compare the performance of the radiomics model, the clinical model, and the combined model, and external validation was used to assess model performance. Results A total of 430 patients were included for model development and training, and 171 patients were designated for external validation. Fifty-one radiomics features and five clinical features were selected for model construction. Among the seven machine learning algorithms, XGBoost demonstrated the best performance, achieving area under the curve (AUC) values of 0.977 [95% confidence interval (CI): 0.964-0.990] and 0.839 (95% CI: 0.776-0.901) on the training and external validation sets, respectively. The radiomics model achieved AUC values of 0.998 (95% CI: 0.997-1.000) and 0.874 (95% CI: 0.822-0.927), while the combined model obtained AUC values of 1.000 (95% CI: 0.999-1.000) and 0.931 (95% CI: 0.894-0.968). DeLong's test indicated that, in the training set, the combined model outperformed the clinical model (Z = 2.154, P < 0.05) and did not differ significantly from the radiomics model (Z = 0.562, P > 0.05); in the external validation set, it outperformed both the clinical and radiomics models (Z = 3.338 and 3.331, respectively; both P < 0.05). Calibration plots and decision curve analysis (DCA) indicated that the combined model had the best calibration and the highest net benefit in both the training and external validation sets, performing consistently across datasets and showing good generalizability and reliability in external validation. Conclusion Machine learning models developed from radiomic and clinical data can identify the biological activity of HCE lesions. The combined model shows higher diagnostic accuracy and greater potential for clinical application, and may serve as a reference when planning treatment for HCE patients.
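
The soft voting step named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not report the voting weights, so equal weights are assumed here, and the labels and probabilities below are invented toy values, not study data. The function names `soft_vote` and `auc` are hypothetical. The sketch averages the two models' predicted probabilities and scores each model with a rank-based AUC.

```python
from typing import List, Sequence


def soft_vote(prob_a: Sequence[float], prob_b: Sequence[float],
              weight_a: float = 0.5) -> List[float]:
    """Weighted average of two models' positive-class probabilities.

    weight_a = 0.5 (equal weighting) is an assumption; the abstract
    does not state which weights were used.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("probability lists must have the same length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(prob_a, prob_b)]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration): 1 = active lesion, 0 = inactive.
labels = [1, 1, 1, 0, 0, 0]
radiomics_prob = [0.9, 0.4, 0.8, 0.5, 0.2, 0.1]
clinical_prob = [0.6, 0.7, 0.3, 0.2, 0.4, 0.3]

combined_prob = soft_vote(radiomics_prob, clinical_prob)

for name, probs in [("radiomics", radiomics_prob),
                    ("clinical", clinical_prob),
                    ("combined", combined_prob)]:
    print(f"{name}: AUC = {auc(labels, probs):.3f}")
```

In this toy case each model misranks a different patient, so averaging the probabilities separates the classes better than either model alone. Whether the same mechanism explains the gain reported on the external validation set cannot be determined from the abstract.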

Key words: Hepatic cystic echinococcosis, Lesion activity, Machine learning model, Radiomics, Clinical features
