BLUE RIBBON POSTER MULTI-DISCIPLINARY: Clinical Implementation of a Knowledge-Based Quality Assurance Tool Using CT and Shape Radiomics for Autosegmentation in Pediatric Craniospinal Irradiation
Abstract
Purpose
Automated segmentation is increasingly integrated into radiotherapy planning workflows; however, ensuring reliable quality assurance (QA) remains challenging, particularly in pediatric patients due to substantial anatomical variability. In this study, we leveraged a large cohort of historical cases to develop a novel knowledge-based QA framework that integrates CT intensity- and shape-based radiomic features within a kernel density estimation (KDE) model. This framework is designed to flag erroneous autosegmentations in pediatric craniospinal irradiation (CSI).
Methods
Baseline KDE distributions of CT number intensity and five geometric shape metrics (eccentricity, sphericity, solidity, perimeter, and area) were derived for 16 organs-at-risk, counting bilateral structures separately (brain, brainstem, chiasm, esophagus, eyes, cochleae, kidneys, lenses, lungs, and optic nerves), from clinician-approved contours of 100 pediatric CSI patients (age range: 2-25 years; median: 8 years). For each auto-generated contour, agreement with the organ-specific baseline KDE was quantified, and contours falling outside an acceptance threshold of ±2 standard deviations were flagged. Autosegmentations generated by three methods (atlas, commercial AI, and an in-house pediatric-trained AI) were evaluated in 47 patients, for a total of 2,256 contours. Segmentation performance of flagged and unflagged contours was compared using the Wilcoxon rank-sum test.
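As an illustration of the flagging step, the following is a minimal sketch only, not the implementation used in this study. It assumes that agreement is measured as the mean log-density of a contour's CT numbers under a pooled baseline KDE, and that the ±2 standard deviation window is taken over the agreement scores of the reference cases. The abstract does not specify either choice. The function names (`agreement_score`, `build_baseline`, `is_flagged`) and the synthetic CT numbers are hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde


def agreement_score(kde, values):
    """Mean log-density of a contour's feature values under the baseline KDE.

    ASSUMPTION: this agreement measure is illustrative; the abstract does not
    state how agreement with the baseline KDE was quantified.
    """
    density = kde(np.asarray(values, dtype=float))
    # Small constant avoids log(0) for values far outside the baseline range.
    return float(np.mean(np.log(density + 1e-12)))


def build_baseline(reference_cases):
    """Fit a pooled KDE and derive a mean +/- 2 SD acceptance window."""
    pooled = np.concatenate([np.asarray(c, dtype=float) for c in reference_cases])
    kde = gaussian_kde(pooled)
    scores = np.array([agreement_score(kde, c) for c in reference_cases])
    mean, sd = scores.mean(), scores.std(ddof=1)
    return kde, (mean - 2.0 * sd, mean + 2.0 * sd)


def is_flagged(kde, window, values):
    """Flag a contour whose agreement score falls outside the window."""
    score = agreement_score(kde, values)
    return not (window[0] <= score <= window[1])


# Synthetic stand-in data (NOT study data): soft-tissue-like CT numbers for
# the reference cases, and an air-like sample mimicking a gross contour error.
rng = np.random.default_rng(0)
reference_cases = [rng.normal(35.0, 8.0, size=200) for _ in range(30)]
kde, window = build_baseline(reference_cases)

plausible_contour = rng.normal(35.0, 8.0, size=200)
erroneous_contour = rng.normal(-700.0, 50.0, size=200)

print("plausible flagged:", is_flagged(kde, window, plausible_contour))
print("erroneous flagged:", is_flagged(kde, window, erroneous_contour))
```

Under these assumptions, the same routine could be applied per organ to each shape metric by passing one scalar per reference case instead of voxel-wise CT numbers.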
Results
Across all organs, contours flagged by the proposed QA framework had significantly lower Dice similarity coefficients, sensitivity, accuracy, and precision than unflagged contours (p < 0.001), while specificity did not differ significantly. Shape-based QA identified 132 contours with geometric inconsistencies, 99 of which were not detected by CT-based QA alone. Incorporating shape QA increased the total flagging rate from 13.6% (CT-based QA only) to 17.8%, improving detection of suboptimal segmentations.
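The comparison of flagged and unflagged contours can be sketched as follows, assuming binary masks for the Dice computation and using `scipy.stats.ranksums` for the Wilcoxon rank-sum test. The Dice values below are synthetic placeholders chosen only to make the example run; they are not results from this study, and the helper name `dice` is hypothetical.

```python
import numpy as np
from scipy.stats import ranksums


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(a, b).sum() / denom)


# Toy masks: the automatic contour is shifted by one column.
reference = np.zeros((8, 8), dtype=bool)
reference[2:6, 2:6] = True
automatic = np.zeros((8, 8), dtype=bool)
automatic[2:6, 3:7] = True
print("Dice of shifted square:", dice(reference, automatic))

# Synthetic Dice values (NOT study data) for the two groups being compared.
rng = np.random.default_rng(1)
dice_flagged = rng.uniform(0.20, 0.60, size=40)
dice_unflagged = rng.uniform(0.70, 0.95, size=200)

# Two-sided Wilcoxon rank-sum test on the two independent groups.
statistic, p_value = ranksums(dice_flagged, dice_unflagged)
print("rank-sum statistic:", statistic, "p-value:", p_value)
```

The same test would be repeated for sensitivity, specificity, accuracy, and precision, each computed per contour against the clinician-approved reference.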
Conclusion
Distinct radiomic patterns from historical pediatric CSI cases enabled the development of a mathematical KDE model integrating CT intensity- and shape-based analyses. Together, these methods form a quantitative, interpretable, and vendor-neutral QA framework for pediatric CSI that facilitates reliable detection of gross autosegmentation errors.