Multicenter Distribution Adaptation for Robust CT-Radiomics Differentiation of Pulmonary Tuberculosis and Fungal Pneumonia
Abstract
Purpose
To develop and validate a robust CT-radiomics framework that uses feature-level multicenter distribution adaptation (MDA) to differentiate pulmonary tuberculosis (TB) from fungal pneumonia (FP), explicitly addressing inter-scanner variability and domain shift across institutions.
Methods
A retrospective multicenter study was conducted on 528 patients (317 TB, 211 FP) from four independent centers. Automated segmentation was performed using TotalSegmentator (lung lobes) and a customized nnU-Net (infection lesions). Radiomic features (n = 1,781) were extracted from seven regions of interest (ROIs: infection, full lung, and five lobes) following Image Biomarker Standardisation Initiative (IBSI) guidelines. To ensure reproducibility, 24 combinations of supervised and unsupervised feature selection methods were ranked by the product of the area under the curve (AUC) and feature stability. A novel MDA framework was implemented to align feature distributions across centers by minimizing marginal and conditional discrepancies. Model performance was evaluated via leave-one-center-out cross-validation and compared against four traditional classifiers (support vector machine [SVM], random forest, XGBoost, and decision tree).
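As a rough illustration of what marginal and conditional discrepancies between two centers can look like, the sketch below assumes a linear-kernel maximum mean discrepancy (squared distance between feature means), computed once over all samples and once per class. The abstract does not state which discrepancy measure or optimizer the MDA framework uses, so the function names, the choice of measure, and the toy mean-shift alignment are all hypothetical and are not the study's implementation.

```python
from collections import defaultdict


def mean_vector(rows):
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def marginal_discrepancy(source, target):
    """Assumed measure: linear-kernel MMD^2, i.e. the squared distance
    between the overall feature means of two centers."""
    return squared_distance(mean_vector(source), mean_vector(target))


def group_by_class(rows, labels):
    """Collect feature vectors per class label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def conditional_discrepancy(source, source_labels, target, target_labels):
    """Sum of class-wise linear-kernel MMD^2 over classes present in both
    centers. In practice the target labels would be unknown; using
    pseudo-labels here is an assumption of this sketch."""
    src = group_by_class(source, source_labels)
    tgt = group_by_class(target, target_labels)
    shared = set(src) & set(tgt)
    return sum(marginal_discrepancy(src[c], tgt[c]) for c in shared)


def mean_shift_align(source, target):
    """Toy alignment (hypothetical): translate the target features so that
    their mean matches the source mean, which drives the marginal
    discrepancy defined above to zero."""
    src_mean = mean_vector(source)
    tgt_mean = mean_vector(target)
    offset = [s - t for s, t in zip(src_mean, tgt_mean)]
    return [[x + o for x, o in zip(row, offset)] for row in target]
```

A mean shift only removes the first-moment gap between centers; it does not in general reduce the class-wise (conditional) term, which is why the two quantities are computed separately here.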
Results
The T-score combined with SPEC emerged as the most robust feature selection strategy. The MDA-enhanced model outperformed the traditional classifiers in every external validation scenario, achieving external validation AUCs of 0.880 to 0.914 and significantly surpassing the best-performing traditional model (XGBoost, maximum AUC 0.814; p < 0.05). t-SNE visualization showed that MDA effectively removed center-specific clustering, indicating successful domain alignment. SHAP analysis revealed that multi-scale wavelet-transformed texture features were the dominant predictors.
Conclusion
The proposed feature-level MDA framework effectively mitigates domain shifts caused by scanner heterogeneity, significantly improving the generalizability of radiomics models. This approach provides a robust, non-invasive tool for differentiating TB from FP in diverse multicenter clinical settings.