Interactive Personalized AI for Physician-In-the-Loop 3D Tumor Segmentation on CT
Abstract
Purpose
To develop and evaluate a personalized physician-in-the-loop (PPitL) AI framework for accurate and efficient CT tumor segmentation through iterative clinician feedback.
Methods
We used a multi-organ CT tumor segmentation dataset of 825 patients, stratified by cancer type: colon (n=126), pancreas (n=281), liver (n=118), and kidney (n=300). Data were obtained from public segmentation challenges: MSD (colon/pancreas; Memorial Sloan Kettering Cancer Center), LiTS (liver; multi-center), and KiTS21 (kidney; M Health Fairview and Cleveland Clinic). The proposed model builds on a recent foundation model (MedSAM): the pretrained image encoder was kept frozen, and new components were trained to support personalization. Specifically, personalization was introduced through cascaded self-attention and cross-attention modules. These modules capture the relationship between each clinician’s current correction and prior segmentation outputs, enabling the model to anticipate and adapt to the clinician’s refinement style. Clinician feedback could be provided as points, scribbles, or bounding boxes. At each step, the system generated a candidate segmentation mask and, after clinician review and confirmation, applied a lightweight CNN-based refinement. Training and evaluation used five-fold cross-validation. A composite loss function (Dice + cross-entropy) was used to optimize both overlap and boundary accuracy across iterative corrections. Performance was evaluated using the Dice similarity coefficient (DSC) and normalized surface Dice (NSD).
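As a rough illustration of the composite loss named above, the following minimal sketch combines a soft Dice term with binary cross-entropy on flattened masks. It is not the authors' implementation: the function names, the equal default weights, the smoothing constants, and the binary (tumor vs. background) formulation are all assumptions made for this example.

```python
import math
from typing import Sequence


def dice_coefficient(pred: Sequence[float], target: Sequence[float],
                     eps: float = 1e-6) -> float:
    """Soft Dice similarity coefficient for flattened masks with values in [0, 1].

    eps is an assumed smoothing constant that avoids division by zero
    when both masks are empty.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)


def binary_cross_entropy(pred: Sequence[float], target: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean voxel-wise binary cross-entropy; predictions are clipped to avoid log(0)."""
    loss = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss / len(pred)


def composite_loss(pred: Sequence[float], target: Sequence[float],
                   dice_weight: float = 1.0, ce_weight: float = 1.0) -> float:
    """Weighted sum of Dice loss (1 - DSC) and cross-entropy.

    The equal default weights are an assumption; the abstract does not
    state how the two terms were balanced.
    """
    dice_loss = 1.0 - dice_coefficient(pred, target)
    return dice_weight * dice_loss + ce_weight * binary_cross_entropy(pred, target)
```

In this sketch the Dice term rewards region overlap, which helps when tumors occupy a small fraction of the volume, while the cross-entropy term penalizes each voxel individually. Calling `dice_coefficient` on thresholded (0/1) masks gives the hard DSC used as an evaluation metric.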
Results
The model outperformed recent state-of-the-art methods and achieved near-expert performance after 10 iterations of clinician feedback: mean DSC/NSD of 0.947±0.03/0.982±0.02 (colon tumors), 0.955±0.02/0.936±0.03 (pancreas tumors), 0.948±0.02/0.951±0.01 (liver tumors), and 0.972±0.04/0.988±0.04 (kidney tumors). Accuracy improved steadily with each iteration, with major corrections typically completed within the first 5–6 interactions.
Conclusion
The proposed PPitL AI segmentation tool provides fast, reliable tumor delineation on CT. By adapting to each clinician’s correction behavior, it builds trust, reduces variability, and minimizes time burden while achieving expert-level accuracy.