Evaluation of Hand-Crafted Radiomics and Foundational Models to Predict Esophagitis in Patients with Locally Advanced Non-Small Cell Lung Cancer on the NRG Oncology Trial RTOG 0617
Abstract
Purpose
We implemented a hand-crafted radiomics model and compared it with Med3D, a popular foundational transfer network pretrained on a large-scale medical imaging dataset, for prediction of radiation-associated esophagitis in patients with locally advanced non-small cell lung cancer (LA-NSCLC).
Methods
The NRG Oncology/Radiation Therapy Oncology Group (RTOG) 0617 dataset was used. We applied hand-crafted radiomics and a Med3D transfer network without fine-tuning to extract features from CT images and 3D dose maps (dosiomics) within the esophagus contours, with the aim of predicting grade ≥2 esophagitis. Receiver operating characteristic (ROC) and correlation analyses were performed to identify significant predictors of esophagitis. LASSO, Random Forest (RF), XGBoost and Support Vector Machine (SVM) models were trained on 315 patients using 10-fold nested cross-validation (CV). Independent predictors were ranked by importance score, and stepwise-forward feature selection was used to select the subset that minimized validation error. The best-performing model (highest AUC) was applied to 136 previously unseen test patients.
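The nested cross-validation and held-out testing described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic (generated with scikit-learn, not trial data), an L1-penalized logistic regression stands in for the LASSO classifier, the inner loop uses 5 folds, and the hyperparameter grid is arbitrary. The ROC/correlation prefiltering and stepwise-forward feature selection are omitted.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     cross_val_score, train_test_split)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the radiomics/dosiomics feature matrix (NOT trial data):
# 451 "patients", 50 features, roughly 40% positive class.
X, y = make_classification(n_samples=451, n_features=50, n_informative=8,
                           weights=[0.6, 0.4], random_state=0)

# Hold out 136 patients as an unseen test set, leaving 315 for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=136, stratify=y, random_state=0)

# Assumption: L1-penalized logistic regression as a LASSO-style classifier.
pipe = make_pipeline(StandardScaler(),
                     LogisticRegression(penalty="l1", solver="liblinear"))
param_grid = {"logisticregression__C": [0.01, 0.1, 1.0]}  # illustrative grid

# Inner loop tunes C; outer 10-fold loop estimates validation AUC.
inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
outer = StratifiedKFold(n_splits=10, shuffle=True, random_state=2)
search = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=inner)
nested_auc = cross_val_score(search, X_train, y_train,
                             scoring="roc_auc", cv=outer)

# Refit on all training patients, then score once on the held-out test set.
search.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

print(f"nested CV AUC: {nested_auc.mean():.2f}, test AUC: {test_auc:.2f}")
```

Comparing the mean nested-CV AUC with the test AUC gives a rough sense of generalization error: a large drop from validation to test suggests overfitting to the training cohort.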
Results
Grade ≥2 esophagitis occurred in 40% of patients (178/451). Models based solely on clinical features did not demonstrate predictive performance. The highest-performing model (LASSO) used a combination of hand-crafted radiomics and dosiomics features and achieved an AUC of 0.70 on both the validation and test datasets. The Med3D feature-based model (Random Forest) achieved AUCs of 0.75 and 0.62 on the validation and test datasets, respectively, suggesting an increased generalization error with this model.
Conclusion
These results suggest that the Med3D foundational network combined with hand-crafted radiomics and dosiomics may serve as a signature for prediction of post-radiotherapy esophagitis. The increased generalization error of the Med3D model suggests that better training and fine-tuning of the transfer-learning network, with larger sample sizes, is warranted.