From Preference Learning to Clinical Insight: Interpreting Expert Treatment Plan Evaluation
Abstract
Purpose
Objective assessment of radiotherapy plans is challenging because expert judgment relies on complex, multidimensional tradeoffs that are not fully captured by predefined dose-volume constraints. This study aims to quantitatively interpret expert treatment plan evaluation to improve transparency and reproducibility.
Methods
We studied a planner preference-aligned plan evaluation framework, which learns from expert-labeled and LLM-generated pairwise plan preferences based on clinically relevant dose-volume metrics. The model was trained using 1,240 plan pairs from locally advanced non-small cell lung cancer cases. To interpret the learned plan score function, three analyses were performed. First, we perturbed individual dosimetric features while keeping the plan score constant to quantify tradeoffs between competing objectives. Second, SHAP-based feature importance was computed for 35 expert-reviewed plan pairs and compared between pairs with unanimous expert preference and pairs with inter-expert disagreement to determine the sources of ambiguity. Third, feature ranks were assessed across all plans to evaluate the consistency of objective priority across treatment plans.
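The first analysis can be illustrated with a minimal sketch. It assumes a scalar plan score that can be evaluated on a dictionary of dose-volume features; `toy_plan_score`, its weights, the feature names, and `iso_score_offset` are hypothetical stand-ins and not the study's model or code. One feature is shifted, and bisection then finds the change in a competing feature that restores the original score.

```python
from typing import Callable, Dict

Plan = Dict[str, float]


def toy_plan_score(plan: Plan) -> float:
    """Hypothetical stand-in for the learned score (higher is better).

    The weights are made up for illustration; they are not from the study.
    """
    weights = {
        "lung_v20_pct": -0.40,
        "heart_mean_gy": -0.10,
        "esophagus_mean_gy": -0.08,
    }
    return sum(weights[name] * plan[name] for name in weights)


def iso_score_offset(
    score_fn: Callable[[Plan], float],
    plan: Plan,
    changed: str,
    delta: float,
    compensating: str,
    lo: float = -50.0,
    hi: float = 50.0,
    tol: float = 1e-9,
) -> float:
    """Change in `compensating` that restores the original score
    after `changed` has been shifted by `delta` (found by bisection)."""
    target = score_fn(plan)
    shifted = dict(plan)
    shifted[changed] += delta

    def residual(offset: float) -> float:
        trial = dict(shifted)
        trial[compensating] += offset
        return score_fn(trial) - target

    f_lo, f_hi = residual(lo), residual(hi)
    if f_lo * f_hi > 0:
        raise ValueError("no iso-score offset inside the search bracket")

    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid)
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


# Illustrative plan with made-up feature values.
plan = {"lung_v20_pct": 25.0, "heart_mean_gy": 10.0, "esophagus_mean_gy": 20.0}
heart_offset = iso_score_offset(
    toy_plan_score, plan, "lung_v20_pct", -1.0, "heart_mean_gy"
)
```

With these made-up linear weights, the offset is simply the weight ratio (0.40 / 0.10, about 4 Gy); for a nonlinear learned score the same search would be repeated across plans and the resulting offsets summarized, for example with a linear fit.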
Results
The tradeoff analysis showed approximately linear relationships between competing objectives, with a 1% reduction in lung V20Gy offset by a 3.9 ± 1.5 Gy increase in mean heart dose (R² = 0.76), or a 4.6 ± 3.5 Gy increase in mean esophagus dose (R² = 0.59). Disagreement cases exhibited similar feature dominance to consensus cases (p = 0.38, Wilcoxon rank-sum test). This indicates that expert disagreements are primarily due to balanced multi-objective tradeoffs rather than conflicting interpretations of a single metric. Feature rank analysis demonstrated consistent prioritization of lung V20Gy (ranked first or second in 88% of cases).
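A rank statistic such as "ranked first or second in 88% of cases" could be computed along the following lines. This is a sketch under assumptions: per-plan attributions are taken to be available as a mapping from feature name to signed attribution value, features are ranked by absolute attribution, and `top_k_fraction`, the feature names, and the example values are hypothetical rather than taken from the study.

```python
from typing import Dict, List


def top_k_fraction(
    attributions: List[Dict[str, float]], feature: str, k: int = 2
) -> float:
    """Fraction of plans in which `feature` is among the k features
    with the largest absolute attribution."""
    if not attributions:
        raise ValueError("need at least one plan")
    hits = 0
    for plan_attr in attributions:
        # Rank feature names by absolute attribution, largest first.
        ranked = sorted(
            plan_attr, key=lambda name: abs(plan_attr[name]), reverse=True
        )
        if feature in ranked[:k]:
            hits += 1
    return hits / len(attributions)


# Made-up attribution values for four plans (illustration only).
example = [
    {"lung_v20": 0.50, "heart_mean": -0.30, "esophagus_mean": 0.10},
    {"lung_v20": -0.20, "heart_mean": 0.45, "esophagus_mean": 0.05},
    {"lung_v20": 0.05, "heart_mean": 0.40, "esophagus_mean": -0.35},
    {"lung_v20": 0.60, "heart_mean": 0.10, "esophagus_mean": 0.20},
]
fraction = top_k_fraction(example, "lung_v20")
```

In this toy input the lung feature falls in the top two for three of the four plans; ranking by absolute value is a design choice that treats favorable and unfavorable contributions symmetrically.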
Conclusion
This study provides insight into clinically meaningful tradeoffs and plan evaluation priorities. The expert preference-based plan evaluation method supports transparent plan evaluation and establishes a foundation for an interpretable and automated treatment planning system.