Development and Preliminary Evaluation of a Deep Learning Pipeline for Target Segmentation in MRI-Guided HDR Brachytherapy
Abstract
Purpose
Cervical cancer remains a major global health burden, and image-guided high-dose-rate brachytherapy (HDR-BT) is a critical treatment modality. MRI is the gold standard for defining the Gross Tumor Volume (GTV), High-Risk Clinical Target Volume (HR-CTV), and Intermediate-Risk Clinical Target Volume (IR-CTV), but manual segmentation is time-intensive and prone to inter-observer variability. Despite the clinical shift toward MRI guidance, automated segmentation tools remain predominantly CT-based. This pilot study evaluates the feasibility of a convolutional neural network (CNN) framework for MRI-based segmentation of these target volumes.
Methods
A pilot cohort of 12 patients with gynecological malignancies was selected. Ground-truth contours were manually delineated by radiation oncologists. An automated pipeline based on a self-configuring 3D U-Net architecture was developed. Because the target volumes are nested (the GTV lies within the HR-CTV, which lies within the IR-CTV), a hierarchical labeling strategy was implemented.
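One way to picture a hierarchical labeling of nested targets is sketched below. This is a minimal illustration, not the study's implementation: the integer encoding (0 = background, 1 = IR-CTV shell, 2 = HR-CTV shell, 3 = GTV) and the function names `nested_masks_to_labels` and `labels_to_nested_masks` are assumptions made for the example. The nested binary masks are collapsed into one mutually exclusive label map for training, and each full volume is recovered afterwards by thresholding the label value.

```python
import numpy as np


def nested_masks_to_labels(gtv, hr_ctv, ir_ctv):
    """Collapse nested binary masks into one label map.

    Assumed (hypothetical) encoding:
      0 = background, 1 = IR-CTV only, 2 = HR-CTV only, 3 = GTV.
    """
    gtv = np.asarray(gtv, dtype=bool)
    # Enforce nesting so a slightly inconsistent contour set still
    # yields GTV inside HR-CTV inside IR-CTV.
    hr = np.asarray(hr_ctv, dtype=bool) | gtv
    ir = np.asarray(ir_ctv, dtype=bool) | hr

    labels = np.zeros(ir.shape, dtype=np.uint8)
    labels[ir] = 1   # outermost volume first
    labels[hr] = 2   # overwrite with the next inner volume
    labels[gtv] = 3  # innermost volume last
    return labels


def labels_to_nested_masks(labels):
    """Recover the full (nested) volumes from a label map."""
    labels = np.asarray(labels)
    return {
        "GTV": labels >= 3,
        "HR-CTV": labels >= 2,
        "IR-CTV": labels >= 1,
    }


if __name__ == "__main__":
    # Toy 1D "volume" with three nested targets.
    gtv = np.array([0, 0, 0, 1, 0, 0, 0])
    hr = np.array([0, 0, 1, 1, 1, 0, 0])
    ir = np.array([0, 1, 1, 1, 1, 1, 0])
    label_map = nested_masks_to_labels(gtv, hr, ir)
    print(label_map)
    print(labels_to_nested_masks(label_map)["HR-CTV"].astype(int))
```

Under this encoding, a voxel predicted as HR-CTV shell instead of GTV still counts toward the HR-CTV and IR-CTV volumes, so an error at the innermost level does not propagate outward.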
Results
The model achieved moderate concordance with the manual contours for the larger volumes, with a mean Dice Similarity Coefficient (DSC) of 0.69 ± 0.08 for HR-CTV and 0.70 ± 0.08 for IR-CTV. GTV performance was lower (DSC 0.16 ± 0.15). This is attributed to limited training variance: the model correctly identified the high-risk region but conservatively classified the GTV core as HR-CTV. The mean 95th-percentile Hausdorff distance (HD95) was 9.4 ± 4.2 mm for HR-CTV and 11.2 ± 5.0 mm for IR-CTV.
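For reference, the two reported metrics can be computed roughly as follows. This is a hedged sketch using NumPy only, not the evaluation code used in the study: the function names, the 6-neighbour surface definition, and the choice to take the larger of the two directed 95th-percentile distances are assumptions, and HD95 conventions differ between toolkits. The brute-force distance matrix is only practical for small masks.

```python
import numpy as np


def dice(pred, ref):
    """Dice Similarity Coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(2.0 * np.logical_and(pred, ref).sum() / denom)


def surface_points(mask, spacing):
    """Physical coordinates of foreground voxels that touch background."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    core = (slice(1, -1),) * mask.ndim
    interior = np.ones(mask.shape, dtype=bool)
    for axis in range(mask.ndim):
        for shift in (-1, 1):
            # A voxel is interior only if every axis neighbour is foreground.
            interior &= np.roll(padded, shift, axis=axis)[core]
    surface = mask & ~interior
    return np.argwhere(surface) * np.asarray(spacing, dtype=float)


def hd95(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """95th-percentile Hausdorff distance (same units as spacing)."""
    a = surface_points(pred, spacing)
    b = surface_points(ref, spacing)
    if len(a) == 0 or len(b) == 0:
        return float("nan")  # undefined when either mask is empty
    # Pairwise distances between all surface points (O(N*M) memory).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = np.percentile(d.min(axis=1), 95)
    b_to_a = np.percentile(d.min(axis=0), 95)
    return float(max(a_to_b, b_to_a))


if __name__ == "__main__":
    # Two 4x4x4 cubes offset by one voxel along the first axis.
    pred = np.zeros((10, 10, 10), dtype=bool)
    ref = np.zeros((10, 10, 10), dtype=bool)
    pred[2:6, 2:6, 2:6] = True
    ref[3:7, 2:6, 2:6] = True
    print("DSC:", dice(pred, ref))
    print("HD95:", hd95(pred, ref))
```

The `spacing` argument matters for MRI, where slice thickness often differs from in-plane resolution. Passing the voxel size in millimetres yields HD95 in millimetres, as reported above.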
Conclusion
This pilot study indicates that the pipeline is stable ahead of its expansion to a dataset of approximately 300 patients, which is expected to improve GTV generalization. Next steps involve extending the pipeline to a multi-view framework that integrates contours from the sagittal and coronal planes to augment axial predictions. A multi-modal integration phase will also be deployed, in which large language models (LLMs) encode clinical notes and diagnostic reports. By embedding this contextual data alongside imaging, we aim to refine the predicted contours, specifically the IR-CTV boundary, which relies heavily on pre-treatment diagnostic context.