Overcoming Computational Bottlenecks In Large-Scale Medical Image Segmentation Using Optimized U-Net
Abstract
Purpose
Training nnU-Net models for medical image segmentation with large patient samples is computationally expensive, limiting iteration speed in research and clinical translation. We present an optimized training workflow that significantly accelerates nnU-Net training on massive datasets without compromising segmentation performance.
Methods
A cohort of 1,539 CT scans was used to train nnU-Net (v2) for multi-organ segmentation of 117 structures. The baseline followed the standard nnU-Net training pipeline with default preprocessing, sampling, and training schedule. We implemented an accelerated workflow incorporating three key engineering enhancements: (1) automatic mixed precision training, (2) dynamic batch size scaling to maximize GPU memory utilization, and (3) an efficiency-driven training schedule with early convergence monitoring. Both workflows used identical input resolutions and network architectures and ran on a single NVIDIA L40S GPU. Performance was benchmarked using total GPU training time, Dice Similarity Coefficient (DSC), and 95th-percentile Hausdorff Distance (HD95) on a held-out test set of 89 CT scans.
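Of the three enhancements, mixed precision and dynamic batch sizing are handled by standard framework features (e.g., PyTorch's autocast/GradScaler utilities); the convergence-monitoring idea can be sketched independently. The following is a minimal illustration only, not the paper's implementation: the class name, `patience`, and `min_delta` parameters are hypothetical, and the actual stopping criterion used in the study is not specified in this abstract.

```python
class ConvergenceMonitor:
    """Hypothetical early-convergence monitor: request a stop when the
    validation metric (e.g., mean DSC) fails to improve by at least
    `min_delta` for `patience` consecutive epochs."""

    def __init__(self, patience: int = 30, min_delta: float = 1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.stale_epochs = 0

    def update(self, metric: float) -> bool:
        """Record one epoch's validation metric; return True to stop."""
        if metric > self.best + self.min_delta:
            # Meaningful improvement: reset the stagnation counter.
            self.best = metric
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Usage sketch: a metric that plateaus triggers an early stop.
monitor = ConvergenceMonitor(patience=3, min_delta=0.001)
for epoch, dice in enumerate([0.50, 0.60, 0.60, 0.60, 0.60]):
    if monitor.update(dice):
        print(f"stopping early at epoch {epoch}")  # prints at epoch 4
        break
```

Stopping once the validation curve flattens is what shortens the fixed-length schedule; the savings compound with mixed precision, which reduces per-iteration cost rather than iteration count.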
Results
Baseline training required 97.4 hours of total GPU time; the proposed workflow reduced this to 35.7 hours, a 2.7× speedup (63.3% time reduction). Segmentation accuracy was preserved across all evaluated structures. For instance, mean DSC (baseline vs. efficient) was 0.794 vs. 0.797 (p=0.136) for the prostate, 0.925 vs. 0.923 (p=0.133) for the brain, and 0.951 vs. 0.955 (p=0.013) for the middle lung lobe; the only statistically significant difference favored the efficient workflow. Mean HD95 was 4.857 vs. 5.517 (p=0.136) for the prostate, 2.781 vs. 3.146 (p=0.180) for the brain, and 4.019 vs. 3.658 (p=0.421) for the middle lung lobe.
Conclusion
We demonstrated that targeted optimization of the nnU-Net training workflow substantially reduces training overhead without compromising segmentation fidelity on a large-scale clinical dataset. This streamlined workflow lowers the computational barrier to deep learning in radiation oncology, facilitating rapid model iteration and more efficient deployment of automated segmentation tools in clinical environments.