Poster · Poster Program · Therapy Physics

Bridging Geometric Accuracy and Clinical Utility in Deep Learning Auto‑Segmentation: The Role of Organ Volume

Abstract

Purpose

To integrate qualitative assessments from experienced treatment planners with quantitative geometric comparisons to evaluate the clinical utility of a deep learning (DL)-based auto-segmentation tool and establish performance criteria grounded in clinical usability.

Methods

RayStation 2024A deep learning auto-segmentation was applied to 50 anonymized planning CTs across five body sites, generating 6–20 organ-at-risk (OAR) contours per case. Ten experienced planners (five dosimetrist–physicist pairs) independently reviewed all contours for their assigned site and provided a clinical utility (CU) score of 0 (unusable), 1 (potentially useful with major edits), or 2 (useful with minor edits). Across all cases, 349 DL-generated contours from 35 organs with corresponding expert contours were quantitatively evaluated using Dice coefficients, Hausdorff distance, and organ volumes. Linear regression models assessed associations between CU score and Dice, volume, and their interaction, with performance evaluated using R², RMSE, and coefficient p-values.
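The three geometric quantities named above can be illustrated with a minimal sketch. It assumes each contour has already been rasterized to a set of (x, y, z) voxel indices on a common grid, and it computes the Hausdorff distance by brute force over all voxels rather than over extracted surfaces. The function names and the toy contours are hypothetical and are not taken from the study or from RayStation.

```python
import math


def dice(a, b):
    """Dice coefficient between two contours given as sets of voxel indices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty contours agree
    return 2.0 * len(a & b) / (len(a) + len(b))


def volume_cc(voxels, spacing_mm):
    """Contour volume in cc from its voxel count and (x, y, z) spacing in mm."""
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return len(set(voxels)) * voxel_mm3 / 1000.0  # 1 cc = 1000 mm^3


def hausdorff_mm(a, b, spacing_mm):
    """Symmetric Hausdorff distance in mm (brute force, O(len(a) * len(b)))."""
    def to_mm(voxel):
        return tuple(i * s for i, s in zip(voxel, spacing_mm))

    pa = [to_mm(v) for v in set(a)]
    pb = [to_mm(v) for v in set(b)]
    if not pa or not pb:
        raise ValueError("Hausdorff distance is undefined for an empty contour")

    def directed(p, q):
        # Largest distance from any point in p to its nearest point in q.
        return max(min(math.dist(x, y) for y in q) for x in p)

    return max(directed(pa, pb), directed(pb, pa))


# Toy example: a 2x2x2 voxel cube and the same cube shifted by one voxel in x.
spacing = (1.0, 1.0, 3.0)  # assumed voxel spacing in mm
auto = {(i, j, k) for i in range(2) for j in range(2) for k in range(2)}
expert = {(i + 1, j, k) for (i, j, k) in auto}

print(dice(auto, expert))                    # overlap of 4 voxels out of 8 + 8
print(volume_cc(auto, spacing))              # 8 voxels of 3 mm^3 each
print(hausdorff_mm(auto, expert, spacing))   # one-voxel shift along x
```

A production workflow would more likely use surface-based distances and a spatial index; the brute-force form is kept here only because it makes the definition explicit.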

Results

Of 49 organs evaluated, 33 were deemed clinically useful, 12 potentially useful, and 4 unusable, with mean CU scores of 1.85 ± 0.18, 1.59 ± 0.13, and 0.16 ± 0.26, respectively. A Dice-only model showed a statistically significant but weak association with CU (p = 0.021, R² = 0.015). Volume alone was not predictive; however, adding a Dice × volume interaction term markedly improved model fit (p = 4.2×10⁻¹⁸, R² = 0.216) and reduced RMSE from 0.644 to 0.576. The positive interaction coefficient (β₃ = 0.98) indicated that Dice became increasingly predictive as organ volume increased. Conversely, Hausdorff distance demonstrated strong predictive power for small volumes (<20 cc; p = 1.7×10⁻¹³, R² = 0.38).
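To make the interaction model concrete, the sketch below fits CU = β₀ + β₁·Dice + β₂·V + β₃·Dice·V by ordinary least squares and reports R² and RMSE. The four-term form is an assumption inferred from the β₃ label; the abstract does not state the full specification, the volume scaling, or the software used. The data in the usage lines are synthetic, and coefficient p-values are not computed here.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_interaction_model(dice, volume, cu):
    """OLS fit of cu = b0 + b1*dice + b2*volume + b3*dice*volume.

    Returns (beta, r_squared, rmse), where beta = [b0, b1, b2, b3].
    """
    rows = [[1.0, d, v, d * v] for d, v in zip(dice, volume)]
    p = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, cu)) for i in range(p)]
    beta = solve_linear(xtx, xty)

    pred = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    ss_res = sum((y - f) ** 2 for y, f in zip(cu, pred))
    mean_y = sum(cu) / len(cu)
    ss_tot = sum((y - mean_y) ** 2 for y in cu)
    r_squared = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / len(cu))
    return beta, r_squared, rmse


# Synthetic illustration only (NOT study data): a 3 x 3 grid of Dice and
# volume values with CU generated from known, made-up coefficients.
dice_vals = [d for d in (0.5, 0.7, 0.9) for _ in range(3)]
vol_vals = [5.0, 50.0, 200.0] * 3
cu_vals = [0.2 + 0.5 * d + 0.001 * v + 0.002 * d * v
           for d, v in zip(dice_vals, vol_vals)]

beta, r_squared, rmse = fit_interaction_model(dice_vals, vol_vals, cu_vals)
print(beta, r_squared, rmse)
```

In this form the slope of CU with respect to Dice is β₁ + β₃·V, so a positive β₃ means that a given change in Dice shifts the predicted CU score more for larger organs, which is the pattern the results describe.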

Conclusion

Common geometric metrics alone have limited ability to predict subjective clinical utility. However, incorporating organ volume shows that data-driven combinations of metrics can align more closely with planners' perception of efficiency. Ongoing work incorporating curvature metrics and comparative DL-versus-clinical dosimetry aims to further refine performance assessment.

People

Jun Lian, PhD · Corresponding Author · University of North Carolina at Chapel Hill
Shiva K. Das, PhD · Author · University of North Carolina
Zachary Gude, PhD · Presenting Author · UNC Hospitals, Radiation Oncology
