From Voxels to Mutations: Stability-Aware Multi-Target CT Imaging Prediction of EGFR and KRAS Mutations
Abstract
Purpose
To address the limited robustness of existing CT-based radiogenomic models, this study develops a multicenter framework for non-invasive dual prediction of EGFR and KRAS gene mutations to support personalized management of non-small cell lung cancer (NSCLC). It compares handcrafted radiomic features (HRFs), deep radiomic features (DRFs, extracted from pre-trained deep-learning networks), and their combination, with the aim of improving generalizability across heterogeneous datasets.
Methods
We conducted a retrospective multicenter radiogenomic study of 1,082 patients with NSCLC from 12 public CT cohorts, with genomic mutation data available for 136 cases. Tumors were segmented by two physicians and validated by an expert. Using the PySERA framework, we extracted 496 IBSI-compliant HRFs and 511 DRFs. Both supervised learning (SL) and semi-supervised learning (SSL) approaches were evaluated. In SL, 27 feature selection and 29 attribute extraction algorithms were paired with 29 classifiers and assessed using five-fold cross-validation on labeled data from one center (NSCLC-Radiogenomics, n=123), with external nested testing performed on an independent cohort (TCGA-LUAD, n=13). In SSL, cases without mutation labels from 10 additional centers (n=946) were pseudo-labeled using logistic regression and incorporated into the training set within each fold; these pseudo-labeled cases were excluded from validation and testing. Model selection used a stability-aware composite score combining the mean and variability of balanced accuracy, recall, precision, F1-score, and ROC-AUC, which prioritizes reproducible and generalizable classifiers and penalizes instability.
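As an illustration of the stability-aware model selection described above, the following is a minimal sketch, not the study's actual scoring formula (which the abstract does not give). It assumes the composite score is the average, over the five metrics, of the per-fold mean minus a penalty times the per-fold standard deviation. The function name `stability_score`, the `penalty` weight, and the fold values are all hypothetical.

```python
from statistics import mean, pstdev

# The five metrics named in the Methods; the dictionary keys are illustrative.
METRICS = ("balanced_accuracy", "recall", "precision", "f1", "roc_auc")


def stability_score(fold_metrics, penalty=1.0):
    """Assumed composite: average over metrics of (fold mean - penalty * fold std).

    fold_metrics maps each metric name to a list of per-fold values.
    A larger penalty punishes fold-to-fold variability more strongly.
    """
    per_metric = []
    for name in METRICS:
        values = fold_metrics[name]
        per_metric.append(mean(values) - penalty * pstdev(values))
    return mean(per_metric)


# Hypothetical five-fold results for two candidate pipelines.
stable = {name: [0.80, 0.81, 0.79, 0.80, 0.80] for name in METRICS}
erratic = {name: [0.95, 0.60, 0.99, 0.70, 0.96] for name in METRICS}

# Rank candidates from best to worst composite score.
candidates = {"stable": stable, "erratic": erratic}
ranked = sorted(
    candidates.items(),
    key=lambda item: stability_score(item[1]),
    reverse=True,
)

for name, folds in ranked:
    print(f"{name}: {stability_score(folds):.3f}")
```

In this toy example the erratic pipeline has the higher mean performance (0.84 versus 0.80), yet the stable pipeline ranks first once variability is penalized, which is the behavior a stability-aware criterion is meant to produce.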
Results
HRF-based models demonstrated superior robustness and generalizability under SSL. The best-performing pipeline, Random Forest-based feature importance selection paired with Histogram Gradient Boosting, achieved a cross-validation ROC-AUC of 0.93±0.02 and accuracy of 0.88±0.01, and an external-testing ROC-AUC of 0.77±0.07 and accuracy of 0.77±0.00. Models based on DRFs alone or on combined DRFs and HRFs showed high cross-validation performance (~0.97) but markedly reduced external ROC-AUC (~0.60). Across all algorithms, SSL significantly outperformed SL (P<0.05).
Conclusion
Standardized HRFs, combined with machine learning within an SSL framework, provide a robust and generalizable approach for non-invasive dual prediction of EGFR/KRAS mutations in NSCLC across multicenter datasets.