Uncertainty Quantification in Neural Network-Based Proton Pencil Beam Dose Prediction
Abstract
Purpose
Deep learning-based dose prediction has emerged as a transformative approach for rapid radiotherapy planning. However, these models typically operate as "black boxes," providing only deterministic predictions without quantifying output reliability. This lack of explainability hinders clinical adoption, as understanding model uncertainty is critical for reliable application. We propose an uncertainty-aware framework that quantifies both epistemic (model) and aleatoric (data) uncertainties for proton pencil-beam dose prediction, enhancing model interpretability and inference accuracy.
Methods
Using CT scans from 27 brain tumor patients, we trained models on beam's-eye-view patches from 2,000 randomly sampled pencil-beam configurations per patient. We developed two architectures: (1) a two-headed CNN encoder-decoder with a transformer block that outputs both the dose and its predictive variance, and (2) a Variational Autoencoder (VAE) that predicts the mean and log-variance of a latent distribution. Both architectures use dropout in every layer. At test time, epistemic uncertainty was estimated from 20 stochastic forward passes with dropout enabled. Aleatoric uncertainty was derived from the predictive variance (CNN) or by decoding 10 Gaussian samples from the latent space (VAE).
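The test-time procedure above is standard Monte Carlo dropout: keep dropout active at inference, run several stochastic forward passes, and read the epistemic uncertainty off the spread of the predictions. A minimal NumPy sketch of that idea follows; the toy linear layer, the weights, and the layer sizes are illustrative stand-ins, not the paper's CNN or VAE.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(x, w, p=0.5, rng=rng):
    """One stochastic forward pass of a toy linear layer with dropout kept
    ON at inference (the MC-dropout trick). `w` is an illustrative weight
    matrix, not a trained model."""
    mask = rng.random(x.shape) >= p        # Bernoulli keep-mask, keep prob 1-p
    return (x * mask / (1.0 - p)) @ w      # inverted-dropout scaling

# Toy "dose" prediction: 20 stochastic passes -> epistemic uncertainty,
# mirroring the 20 forward passes described in Methods.
x = np.ones((1, 4))
w = rng.normal(size=(4, 3))
passes = np.stack([dropout_forward(x, w) for _ in range(20)])

epistemic_mean = passes.mean(axis=0)   # ensemble dose estimate
epistemic_var = passes.var(axis=0)     # epistemic (model) uncertainty
```

The same recipe transfers to a real network by calling the model repeatedly with its dropout layers left in training mode and aggregating the outputs.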
Results
Uncertainty-aware models consistently outperformed deterministic baselines. Models trained with the Beta Negative Log-Likelihood (BNLL) loss and the VAE achieved the lowest mean absolute errors, reducing error by 33% relative to their deterministic counterparts (from ~0.06% to ~0.04% of Dmax). Moderate correlations between predicted uncertainty and ground-truth Monte Carlo (MC) dose errors confirmed that the uncertainty scores flag potential inaccuracies, and detailed comparisons of dose distributions and error maps showed that high-uncertainty regions aligned with complex tissue interfaces.
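One common formulation behind a "beta NLL" loss is the β-weighted Gaussian negative log-likelihood, which scales each element's NLL by its predicted variance raised to the power β (treated as a constant weight), interpolating between the plain heteroscedastic NLL (β = 0) and MSE-like training (β = 1). Whether the paper uses exactly this variant is an assumption; the sketch below, with made-up toy values, only illustrates the mechanics.

```python
import numpy as np

def beta_nll(mean, var, target, beta=0.5):
    """beta-weighted Gaussian negative log-likelihood (one common 'beta NLL'
    formulation; an assumption, not necessarily the paper's exact loss).
    In an autodiff framework the var**beta factor would be wrapped in a
    stop-gradient so it acts as a fixed per-element weight."""
    nll = 0.5 * (np.log(var) + (target - mean) ** 2 / var)
    weight = var ** beta
    return np.mean(weight * nll)

# Toy example: predicted per-voxel dose mean/variance vs. a reference dose.
mean = np.array([0.10, 0.50, 0.90])
var = np.array([0.01, 0.02, 0.05])
target = np.array([0.12, 0.48, 0.80])
loss = beta_nll(mean, var, target)
```

With β = 0 this reduces to the ordinary Gaussian NLL; raising β damps the loss contribution of low-variance elements, which is often reported to stabilize joint mean/variance training.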
Conclusion
This study demonstrates that incorporating uncertainty quantification into deep learning-based proton dose engines not only provides a measure of confidence but also improves model inference accuracy. These findings pave the way for more transparent and reliable AI-assisted treatment planning in proton therapy.