Assessing the Accuracy of an ML Model for Patient-Specific Quality Assurance Prediction: A Single-Center, Single-Linac Validation Study
Abstract
Purpose
This study aimed to develop and validate a machine learning (ML) model that predicts gamma pass rates (GPR) for patient-specific quality assurance (PSQA), with the longer-term goal of releasing the model as an open-source tool to support efficient PSQA workflows in resource-limited countries.
Methods
A total of 850 clinical treatment plans covering IMRT, VMAT, SRT, and SBRT were retrospectively evaluated. All plans were delivered on a Versa HD linac, and PSQA measurements were acquired with the PTW Octavius phantom and 1500 detector array and analyzed in VeriSoft software. GPR was assessed using 3%/3 mm criteria for IMRT/VMAT and 2%/2 mm criteria for SRT/SBRT. The measured GPR dataset served as the reference. An ML model was trained on plan complexity parameters and delivery characteristics from the same hospital and machine, and was then tested on its ability to predict GPR values. Model performance was assessed by the agreement between predicted and measured GPR.
Results
Measured PSQA results demonstrated clinically acceptable performance, with mean GPR values exceeding 95% for IMRT/VMAT (3%/3 mm) and 93% for SRT/SBRT (2%/2 mm). The ML model showed strong predictive capability, with the majority of predicted GPR values falling within ±5% of measured results. This high degree of concordance indicates that the model successfully represented delivery characteristics and plan variability within the available dataset.
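As an illustration of how agreement between predicted and measured GPR can be summarized, the following is a minimal Python sketch, not the study's own analysis code. It assumes that "within ±5%" means an absolute difference of at most 5 percentage points; the function name `agreement_summary`, the `tolerance` parameter, and the example values are hypothetical and are not taken from the study dataset.

```python
from statistics import mean


def agreement_summary(measured, predicted, tolerance=5.0):
    """Summarize agreement between measured and predicted GPR values.

    Both inputs are sequences of gamma pass rates in percent.
    `tolerance` is in percentage points (an assumed reading of "±5%").
    """
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Absolute prediction error for each plan, in percentage points.
    abs_errors = [abs(p - m) for m, p in zip(measured, predicted)]

    # Number of plans whose prediction falls inside the tolerance band.
    n_within = sum(1 for e in abs_errors if e <= tolerance)

    return {
        "mae": mean(abs_errors),
        "max_abs_error": max(abs_errors),
        "fraction_within": n_within / len(abs_errors),
    }


# Illustrative values only; these are NOT results from the study.
measured = [98.2, 96.5, 93.1, 99.0]
predicted = [97.0, 90.0, 94.6, 98.5]
summary = agreement_summary(measured, predicted)
print(summary)
```

Reporting the mean and maximum absolute error alongside the fraction of plans within tolerance makes it easier to see whether the plans outside the band are near misses or large outliers.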
Conclusion
The ML model predicted GPR within ±5% of measured values for the majority of plans. Despite these promising outcomes, use of the model for workload triage and preliminary plan assessment requires further validation on multiple linac models and with larger multi-institutional datasets.