Poster Program: Therapy Physics

Multi-Task Machine Learning with Interpretability for Predicting the Imaging and Radiation Oncology Core Head and Neck Phantom Outcomes

Abstract

Purpose

The Imaging and Radiation Oncology Core (IROC) IMRT head and neck (HN) phantom continues to show a rate of unacceptable deliveries of ~10%. A better understanding of what drives performance on this audit would support targeted improvement of radiotherapy quality. Machine learning is a promising tool for interpreting phantom results. Unlike traditional single-task (ST) models, a Multi-Task Learning (MTL) framework can exploit intrinsic correlations by treating target, organ-at-risk (OAR), and 2D film dosimetry as interconnected tasks, with the aim of enhancing the detection of unacceptable plans.

Methods

We analyzed 1,447 IROC HN phantom irradiations (101 failures) performed by over 1,000 institutions between 2012 and 2020. The feature set included plan complexity metrics, TPS parameters, dosiomics, and DVH metrics. We developed a chain-based multi-task architecture using XGBoost (MT-XGB) and Random Forest (MT-RF) to jointly learn regression tasks (predicting average gamma and the primary PTV, secondary PTV, and OAR TLD ratios) and a binary classification task (Pass/Fail). Models were built with bootstrap-voting feature selection and random oversampling, and evaluated with 20×5 repeated stratified k-fold cross-validation. SHAP partial dependence plots were used to interpret non-linear relationships between features and outcomes.
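
The pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy `OneNN` learner stands in for XGBoost or Random Forest, the task names and data are hypothetical, "20×5" is read as repeats × folds, and oversampling is applied to the training split only (the abstract does not specify either point). Bootstrap-voting feature selection is not shown.

```python
import math
import random


class OneNN:
    """Toy 1-nearest-neighbour learner (hypothetical stand-in for XGBoost / Random Forest)."""

    def fit(self, X, y):
        self.X, self.y = [list(row) for row in X], list(y)
        return self

    def predict(self, X):
        def nearest(row):
            return min(range(len(self.X)), key=lambda i: math.dist(row, self.X[i]))
        return [self.y[nearest(row)] for row in X]


def fit_chain(X, targets, order):
    """Fit one model per task; each later task also sees the earlier tasks' true values."""
    models, X_aug = [], [list(row) for row in X]
    for name in order:
        models.append((name, OneNN().fit(X_aug, targets[name])))
        X_aug = [row + [t] for row, t in zip(X_aug, targets[name])]
    return models


def predict_chain(models, X):
    """Predict tasks in chain order, feeding each prediction forward as an extra feature."""
    X_aug, out = [list(row) for row in X], {}
    for name, model in models:
        out[name] = model.predict(X_aug)
        X_aug = [row + [p] for row, p in zip(X_aug, out[name])]
    return out


def oversample(indices, labels, rng):
    """Randomly duplicate minority-class indices until every class has equal size."""
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    n_max = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced += members + [rng.choice(members) for _ in range(n_max - len(members))]
    return balanced


def repeated_stratified_folds(labels, n_splits, n_repeats, seed=0):
    """Yield (train_idx, test_idx) pairs with a similar class ratio in every fold."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for cls in sorted(set(labels)):
            members = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(members)
            for k, i in enumerate(members):
                folds[k % n_splits].append(i)
        for k in range(n_splits):
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, folds[k]


# Toy data only: one feature, one regression task, then the Pass/Fail label (1 = fail).
X = [[0.0], [1.0], [2.0], [3.0]]
targets = {"avg_gamma": [0.9, 0.8, 0.4, 0.3], "fail": [0, 0, 1, 1]}
labels = targets["fail"]
for train_idx, test_idx in repeated_stratified_folds(labels, n_splits=2, n_repeats=2):
    train_idx = oversample(train_idx, labels, random.Random(0))  # training split only
    fold_targets = {k: [v[i] for i in train_idx] for k, v in targets.items()}
    chain = fit_chain([X[i] for i in train_idx], fold_targets, order=["avg_gamma", "fail"])
    fold_pred = predict_chain(chain, [X[i] for i in test_idx])
```

Placing the classification task last in the chain lets the Pass/Fail model use the predicted dosimetric endpoints as inputs, which is one plausible way to encode the interconnected tasks; the actual task order used in the study is not given in the abstract.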

Results

Overall, the MTL framework performed comparably to or better than the single-task (ST) models. MT-XGB achieved the highest classification performance, with an F1 score of 0.85 and an accuracy of 0.75, surpassing ST-XGB (accuracy: 0.73) and ST-RF (accuracy: 0.66). Notably, MT-XGB yielded the highest sensitivity (0.76), identifying significantly more failing plans than the ST models. SHAP analysis revealed that aperture-based complexity metrics, specifically Plan Irregularity, mean MLC Speed, and mean MLC Gap, were the dominant drivers of failure, suggesting that modulation complexity may affect the interplay between target coverage and OAR sparing.
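
For readers less familiar with the reported metrics, the sketch below shows how sensitivity, accuracy, and F1 follow from the confusion counts. It assumes that a failing irradiation is the positive class (label 1) and that F1 is computed for that class alone; the abstract does not state which class or averaging scheme was used, so the function name and conventions here are illustrative only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, precision, accuracy, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # share of true failures detected
    precision = tp / (tp + fp) if tp + fp else 0.0    # share of flagged plans that truly fail
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "precision": precision,
            "accuracy": accuracy, "f1": f1}


# Toy labels only (1 = fail, 0 = pass); not data from the study.
metrics = classification_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

With roughly 7% failures in the dataset (101 of 1,447), accuracy is driven largely by the passing class, which is why sensitivity to failures is reported separately.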

Conclusion

In this analysis, plan complexity was the primary determinant of IMRT HN delivery accuracy. By leveraging interconnected endpoints, multi-task learning provides a robust and effective tool for predicting phantom audit outcomes and improving credentialing workflows.

People

Hunter S. Mehrens, PhD · Author · The University of Texas MD Anderson Cancer Center
Lian Duan · Presenting Author · The University of Texas MD Anderson Cancer Center
Panettieri Venessa, PhD · Author · Peter MacCallum Cancer Centre
Clifton David Fuller, PhD · Author · Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center; The University of Texas MD Anderson Cancer Center UTHealth Houston Graduate School of Biomedical Sciences
Stephen F. Kry, PhD · Author · Imaging and Radiation Oncology Core (IROC) Houston Quality Assurance Center, The University of Texas MD Anderson Cancer Center
Paige A. Taylor, PhD, MS · Author · Imaging and Radiation Oncology Core (IROC) Houston Quality Assurance Center, The University of Texas MD Anderson Cancer Center
Christine Peterson, PhD · Author · Department of Statistics, Rice University