CBCT Liver and Lesion Segmentation Using CT Fine-Tuned Foundation Model for CBCT-Guided Radiotherapy
Abstract
Purpose
Cone-beam CT (CBCT) is integral to modern radiotherapy workflows; however, limited soft-tissue contrast and imaging artifacts restrict its quantitative use, particularly for online auto-segmentation in CBCT-guided adaptive radiotherapy. Models pretrained on conventional CT often transfer poorly to CBCT due to domain mismatch. This study investigates whether a CT-fine-tuned foundation model (DINOv3) can serve as a robust and generalizable initializer for CBCT segmentation and whether domain-aligned fine-tuning improves downstream performance.
Methods
We propose CBCT-DINOv3, a CBCT segmentation framework initialized from a CT-fine-tuned DINOv3 foundation model and further fine-tuned for CBCT-specific tasks. CBCT-DINOv3 was compared against two benchmark approaches: (1) nnU-Netv2 trained from scratch and (2) a DINOv3 model pretrained on natural images. All models were trained on the CBCTLiTS dataset, which includes 201 CBCT volumes (131 training, 70 testing) of the liver and liver lesions, synthetically generated from high-quality CT scans acquired across seven clinical sites. During supervised fine-tuning, the encoder was frozen to isolate the effect of initialization. Segmentation performance was evaluated using the Dice similarity coefficient (DSC).
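For reference, the DSC between a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The following is a minimal sketch of that computation for binary masks, not the evaluation code used in this study. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_coefficient(pred, target, empty_value=1.0):
    """Dice similarity coefficient between two binary masks of equal shape.

    `empty_value` is returned when both masks are empty; this convention is
    an assumption for illustration, not something specified in the study.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    # |P| + |G|: total number of foreground voxels in both masks
    denominator = pred.sum() + target.sum()
    if denominator == 0:
        return float(empty_value)

    # |P ∩ G|: voxels labeled foreground in both masks
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denominator)
```

The same function applies unchanged to 3D volumes, since it only counts voxels; for a multi-class label map, each class (for example, liver or lesion) would first be converted to its own binary mask.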
Results
CBCT-DINOv3 achieved the best overall performance on the CBCTLiTS dataset. For liver lesion segmentation, the CT-fine-tuned model achieved mean and median DSC values of 0.421±0.310 and 0.441, respectively, outperforming both natural-image-pretrained DINOv3 (0.404±0.310 and 0.383) and nnU-Netv2 (0.406±0.304 and 0.378). For liver segmentation, all methods demonstrated strong and comparable performance; however, CBCT-DINOv3 showed a modest advantage, achieving mean and median DSC values of 0.939±0.049 and 0.955, compared with 0.935±0.054 and 0.953 (natural-image-pretrained DINOv3) and 0.933±0.058 and 0.953 (nnU-Netv2).
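Summary statistics of the form "mean±SD and median" can be derived from per-case DSC values as in the sketch below. It is illustrative only: the helper name `summarize_dsc` is hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
import statistics


def summarize_dsc(scores):
    """Return (mean, standard deviation, median) of per-case DSC values.

    Uses the sample standard deviation (n - 1 denominator); this choice is
    an assumption, as the estimator is not specified in the abstract.
    """
    scores = list(scores)
    if len(scores) < 2:
        raise ValueError("at least two scores are needed for a sample SD")
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    median = statistics.median(scores)
    return mean, sd, median
```

Reporting the median alongside the mean is useful when per-case scores are widely spread, as the large standard deviations of the lesion results suggest, because a few very low or very high cases can shift the mean considerably.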
Conclusion
Domain-aligned fine-tuning of DINOv3 on CT improves transfer to CBCT when used as an initialization strategy, yielding measurable performance gains, particularly for lesion segmentation. These results underscore the importance of domain alignment in foundation-model transfer and demonstrate a practical pathway toward robust CBCT auto-segmentation for adaptive radiotherapy applications.