Clinical Evaluation of the MVision AI Commercial Contouring Solution for Head and Neck (H&N) and Prostate Radiotherapy Treatments
Abstract
Purpose
To evaluate the performance of the MVision AI contouring solution by assessing the geometric accuracy and dosimetric impact of its contours against the clinically approved contours for head and neck (H&N) and prostate radiotherapy patients.
Methods
A retrospective analysis compared AI-generated contours (MVision Contour+ V1.2.7) with the clinically approved contours of organs at risk (OARs) used for radiotherapy treatment. The study included 15 H&N patients and 10 prostate patients. Geometric accuracy was assessed using the Surface Dice Similarity Coefficient (sDSC) with a tolerance (τ) of 3 mm, a metric that correlates with the manual editing time needed to make contours clinically acceptable, alongside the 95th percentile Hausdorff Distance (HD95) and the Dice Similarity Coefficient (DSC). The dosimetric comparison included the mean dose, maximum dose, and clinically relevant dose-volume parameters calculated for the approved treatment plans.
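To make the three geometric metrics concrete, the following is a minimal pure-Python sketch, not the implementation used in this study. It assumes each contour is available as a non-empty set of integer voxel coordinates, approximates the surface by voxels with a 6-connected neighbour outside the mask, counts surface voxels rather than surface area for sDSC, and takes HD95 as the larger of the two directed nearest-rank 95th percentiles. The function names (`surface`, `dice`, `surface_dice`, `hd95`) and the `spacing` parameter are illustrative choices.

```python
from math import ceil, dist

# 6-connected neighbour offsets used to decide whether a voxel lies on the surface.
_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surface(mask):
    """Return the voxels of `mask` that have at least one neighbour outside it."""
    return {
        (x, y, z)
        for (x, y, z) in mask
        if any((x + dx, y + dy, z + dz) not in mask for dx, dy, dz in _OFFSETS)
    }


def nearest_distances(src, dst, spacing):
    """Distance in mm from each voxel in `src` to its nearest voxel in `dst`."""
    def to_mm(p):
        return tuple(c * s for c, s in zip(p, spacing))

    dst_mm = [to_mm(q) for q in dst]
    return [min(dist(p_mm, q) for q in dst_mm) for p_mm in map(to_mm, src)]


def dice(a, b):
    """Volumetric DSC: twice the overlap divided by the summed volumes."""
    return 2 * len(a & b) / (len(a) + len(b))


def surface_dice(a, b, tol_mm, spacing=(1.0, 1.0, 1.0)):
    """Fraction of surface voxels of both contours within `tol_mm` of the other surface."""
    sa, sb = surface(a), surface(b)
    d_ab = nearest_distances(sa, sb, spacing)
    d_ba = nearest_distances(sb, sa, spacing)
    within = sum(d <= tol_mm for d in d_ab) + sum(d <= tol_mm for d in d_ba)
    return within / (len(sa) + len(sb))


def hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """Larger of the two directed 95th-percentile surface distances (nearest rank)."""
    sa, sb = surface(a), surface(b)

    def pct95(values):
        ordered = sorted(values)
        return ordered[ceil(0.95 * len(ordered)) - 1]

    return max(
        pct95(nearest_distances(sa, sb, spacing)),
        pct95(nearest_distances(sb, sa, spacing)),
    )


# Toy example: a 10 x 10 x 10 voxel cube and a copy shifted by one voxel along x.
cube = {(x, y, z) for x in range(10) for y in range(10) for z in range(10)}
shifted = {(x + 1, y, z) for (x, y, z) in cube}

print(dice(cube, shifted))                      # 0.9
print(surface_dice(cube, shifted, tol_mm=3.0))  # 1.0
print(hd95(cube, shifted))                      # 1.0
```

In this toy case the 1 mm shift lowers DSC to 0.9 while sDSC at τ = 3 mm remains 1.0, because every surface voxel lies within tolerance of the other surface. This illustrates why sDSC is insensitive to boundary deviations that would not require manual editing.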
Results
For H&N cases, the highest sDSC values were observed for the eyes (0.99 ± 0.01), mandible (0.98 ± 0.02), and spinal canal (0.91 ± 0.09). The parotid glands demonstrated good agreement (0.86–0.88), while smaller structures such as the optic chiasm (0.21 ± 0.15) and larynx (0.23 ± 0.16) showed poorer agreement. For prostate cases, the bladder contours were in excellent agreement (0.97 ± 0.03), followed by the rectum (0.92 ± 0.07), seminal vesicles (0.90 ± 0.03), and prostate (0.87 ± 0.05). Correlation analysis revealed that sDSC correlated more strongly with HD95 (ρ = −0.80) than DSC did (ρ = −0.48), suggesting that sDSC better captures the boundary discrepancies relevant to manual editing requirements. Mean dosimetric differences were clinically acceptable for most OARs.
Conclusion
MVision AI demonstrates clinically acceptable performance for large, well-defined structures, with mean sDSC exceeding 0.85 for the bladder, rectum, parotids, and spinal canal. Smaller structures and those with greater anatomical variability still require manual editing. sDSC may serve as a more clinically relevant metric for evaluating auto-segmentation performance than DSC alone.