Self-Supervised Deep Learning for Automated Segmentation in MR-Guided High Dose-Rate Brachytherapy
Abstract
Purpose
High dose-rate brachytherapy (HDR-BT) is an essential component of cervical cancer treatment. While deep learning has shown promise for automating tasks within HDR-BT, such as segmentation of organs-at-risk (OARs) and targets, high-quality labelled data are limited. To address this issue, we propose self-supervised learning on unlabelled MRI as a pretraining step. We hypothesize that self-supervised pretraining will improve segmentation accuracy.
Methods
In this preliminary study, which focused solely on OAR segmentation, the pretraining dataset consisted of T2-weighted axial MRI scans (700/300 training/testing scans). A vision transformer (ViT) was pretrained using masked autoencoding (MAE), in which the model reconstructs heavily masked images. Following pretraining, the model was fine-tuned for OAR segmentation on labelled scans (60/20 training/testing scans) using a U-Net style decoder head. The pretrained model was compared against the same architecture with random initialization, which served as the baseline.
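The masking step of MAE described above can be illustrated with a minimal sketch. It only shows how patch indices might be split into visible and masked sets; it is not the study's implementation. The function name, the 0.75 masking ratio (the abstract says only "heavily masked"), and the 14 x 14 patch grid are assumptions chosen for illustration.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into visible and masked sets, MAE-style.

    mask_ratio is an assumed value; the abstract does not state the ratio used.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # uniform random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])   # hidden from the encoder
    visible = sorted(indices[num_masked:])  # the only patches the encoder sees
    return visible, masked


# Hypothetical example: a slice cut into a 14 x 14 grid gives 196 patches.
visible, masked = random_patch_mask(196, mask_ratio=0.75, seed=0)
print(len(visible), len(masked))  # 49 147
```

In an MAE setup, the encoder would process only the visible patches, and a lightweight decoder would be trained to reconstruct the masked ones.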
Results
Following masked autoencoding, reconstructions on the test set (N=300) were of visibly good quality. On the downstream segmentation task (N=20), pretraining significantly improved overall vDSC (0.668 vs. 0.580, p≪0.001), with organ-specific gains for bladder (0.730 vs. 0.591, p=0.003), sigmoid (0.621 vs. 0.500, p<0.001), and bowel (0.530 vs. 0.475, p=0.002), while rectal vDSC showed a non-significant trend (0.793 vs. 0.756, p=0.082). Boundary accuracy (HD95) improved for the bladder (5.22 vs. 8.57, p=0.026), with no statistically significant changes observed for other organs or for overall HD95.
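For readers unfamiliar with the overlap metric, the following sketch assumes that vDSC denotes the standard volumetric Dice similarity coefficient, 2|P ∩ R| / (|P| + |R|), computed over binary voxel masks. The function name, the flattened-list representation, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the study.

```python
def volumetric_dice(pred, ref):
    """Dice coefficient 2|P ∩ R| / (|P| + |R|) for flattened binary voxel masks."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as organ in both the prediction and the reference
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 shared.
pred = [0, 1, 1, 1, 0, 0]
ref = [0, 0, 1, 1, 1, 0]
score = volumetric_dice(pred, ref)  # 2 * 2 / (3 + 3)
print(round(score, 3))
```

Because Dice measures volume overlap, it is less sensitive to localized boundary errors than a surface-distance metric such as HD95, so the two can diverge.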
Conclusion
Pretraining using MAE improved downstream segmentation accuracy, as indicated by vDSC, but not HD95, compared to random initialization. The limited improvements in HD95 may be due either to persistent mismatch in the most error-prone portions of organs (e.g. the rectum-sigmoid interface) or to selective enhancement of global organ localization rather than boundary delineation. We will expand the pretraining dataset to include a wider range of datasets and will apply this method to other state-of-the-art segmentation pipelines.