What Are We Really Catching with Patient-Specific QA? A Six-Year, 8516-Case Analysis of Detection Efficiency and Site-Specific Variability
Abstract
Purpose
To evaluate temporal and site-specific patterns in phantom-based patient-specific quality assurance (PSQA) performance and to estimate failure-capture efficiency using a hypothetical gamma passing rate (GPR) threshold.
Methods
Because all plans met the clinical acceptance criterion (GPR threshold of 90%), a retrospective failure threshold of GPR <95% was hypothetically applied to assess detection efficiency. PSQA reports were extracted via automated Python scripts. Statistical tests included Kruskal–Wallis for temporal GPR variation, chi-square for site-specific failure rates, and Levene’s test for GPR variability. Workload was estimated at 0.3 hours per plan.
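As an illustration of the chi-square comparison of site-specific failure rates, the statistic can be sketched in a few lines of plain Python. This is a minimal sketch, not the study's extraction or analysis scripts: the function name `chi_square_statistic` is hypothetical, and the counts in `example_table` are made-up placeholders rather than study data. The resulting statistic would be compared against a chi-square distribution with the returned degrees of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a sites-by-outcome contingency table.

    `table` is a list of rows, one per treatment site, each row holding
    [n_flagged, n_passed]. Returns (statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if flagging were independent of site.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts for two sites (NOT study data): [flagged, passed].
example_table = [[10, 90], [30, 70]]
stat, dof = chi_square_statistic(example_table)
print(f"chi-square = {stat:.2f}, dof = {dof}")
```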
Results
Among 8,516 plans, 336 (3.9%) fell below the 95% GPR threshold. GPR differed significantly across years (Kruskal–Wallis H = 132.56, p < 0.001), with the lowest median (97.87%) and highest failure rate (6.1%) in 2023. Failure rates varied significantly by site (chi-square = 112.22, p < 0.001), with breast (10.3%) and brain (7.0%) plans showing the highest rates and variability (Levene’s test, p < 0.01). A heatmap showed persistent failure clustering in brain, pelvis, and abdomen. Total QA workload was 2,129 hours over six years (~0.21 FTE/year), equating to 7.6 labor hours per flagged case.
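The workload figures follow from simple arithmetic, sketched below. The function name `qa_workload` and its parameters are illustrative and are not taken from the study's scripts. With the stated 0.3 hours per plan, the sketch gives about 2,555 hours in total and 7.6 hours per flagged case. The reported 2,129-hour total corresponds instead to 0.25 hours per plan, so the rate is left as a parameter.

```python
def qa_workload(n_plans, n_flagged, hours_per_plan):
    """Summarise PSQA workload under a flat hours-per-plan assumption."""
    total_hours = n_plans * hours_per_plan
    return {
        "flag_rate_pct": 100.0 * n_flagged / n_plans,
        "total_hours": total_hours,
        "hours_per_flagged_case": total_hours / n_flagged,
    }


# Plan and flagged-case counts as reported; 0.3 h/plan is the stated estimate.
summary = qa_workload(n_plans=8516, n_flagged=336, hours_per_plan=0.3)
for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```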
Conclusion
Although all plans met clinical acceptance, applying a hypothetical GPR <95% threshold flagged 336 cases (3.9%) over six years, at a cost of more than 2,000 labor hours. Temporal and anatomical variability suggests that risk-adaptive, site-specific QA strategies may improve the balance between safety and efficiency in radiotherapy quality assurance.