Multiparametric Machine Learning Model Selection for Improved Survival Prediction in Lung Cancer: Using Semi-Supervised Learning and PET-CT Fusion Tensor Radiomics
Abstract
Purpose
To apply a multiparametric model selection strategy within a tensor radiomics paradigm, whereby different flavours of radiomics features are generated from multiple PET-CT image fusion strategies, to identify reliable and generalizable machine learning (ML) models for lung cancer outcome prediction.
Methods
We assembled a lung cancer cohort of 693 patients (581 from multiple centers for training; 112 from an external center for testing) and used PySERA to extract radiomics features from PET, CT, and PET-CT images fused with different methods and parameters (weighted, wavelet, and PCA), yielding 12 flavours. ComBat harmonization was applied to reduce center-related variability. Radiomics features were first evaluated using the intraclass correlation coefficient (ICC), and only robust features (ICC ≥ 0.6) were retained, along with clinical features (e.g. age, smoking status). The retained features served as inputs for ML model selection, which covered 29 classifiers and 56 dimensionality reduction techniques. During model training we compared three input configurations: individual flavours, flavours selected and combined through a tensor paradigm, and clinical features alone. Training followed a 5-fold semi-supervised approach (257 labeled, 324 pseudo-labeled) for binary prediction of 2-year event-free survival: models were trained on labeled data, augmented with pseudo-labeled cases, validated per fold, and tested on the external cohort. Models were ranked by a composite performance-stability score out of 2, in which the mean and standard deviation of balanced accuracy, recall, precision, F1-score, and AUC across cross-validation folds were normalized and combined.
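The abstract does not give the formula for the composite score, so the following is only a minimal sketch of one way a performance-stability score out of 2 could be built. It assumes (and these are assumptions, not the authors' stated method) that the performance term is the average of the per-metric fold means, that the stability term is one minus the average fold standard deviation scaled by 0.5 (the largest population standard deviation possible for values in [0, 1]), and that the two terms are summed. The function name `composite_score` and the parameter `max_std` are hypothetical.

```python
import numpy as np

# Metric order assumed for the columns of the input array (illustrative only).
METRICS = ("balanced_accuracy", "recall", "precision", "f1", "auc")


def composite_score(fold_metrics, max_std=0.5):
    """Hypothetical performance-stability score in [0, 2].

    fold_metrics: array-like of shape (n_folds, n_metrics), values in [0, 1].
    max_std: normalizer for the standard deviation; 0.5 is the largest
             population standard deviation of values bounded in [0, 1].
    """
    fold_metrics = np.asarray(fold_metrics, dtype=float)
    means = fold_metrics.mean(axis=0)         # per-metric mean across folds
    stds = fold_metrics.std(axis=0, ddof=0)   # per-metric population std across folds

    # Performance term: average of the per-metric means, already in [0, 1].
    performance = means.mean()
    # Stability term (assumed form): 1 means identical folds, 0 means maximal spread.
    stability = 1.0 - np.clip(stds / max_std, 0.0, 1.0).mean()
    return float(performance + stability)


if __name__ == "__main__":
    # Made-up fold results for one pipeline: 5 folds x 5 metrics.
    rng = np.random.default_rng(0)
    example = np.clip(0.78 + 0.03 * rng.standard_normal((5, len(METRICS))), 0.0, 1.0)
    print(f"composite score: {composite_score(example):.2f}")
```

Under these assumptions, a pipeline is rewarded both for high average metrics and for low fold-to-fold variation, so two pipelines with the same mean performance are separated by how consistently they reach it.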
Results
The top-performing pipelines included an autoencoder for dimensionality reduction paired with a multilayer perceptron (MLP) classifier (composite score 1.62 out of 2), which achieved balanced accuracy = 0.77 ± 0.03, F1 = 0.78 ± 0.03, precision = 0.78 ± 0.03, recall = 0.78 ± 0.03, and AUC = 0.86 ± 0.03, averaged across all single-flavour and tensor paradigms, with performance confirmed on the external test set.
Conclusion
These findings indicate that multiparametric model selection within a tensor radiomics paradigm can identify reliable and generalizable ML models for lung cancer outcome prediction.