PlanningCopilot: A Multi-Agent LLM Framework Integrating Pre-Compiled ESAPI Executables for Autonomous Planning Optimization in Locally Advanced NSCLC
Abstract
Purpose
Consistently generating clinically acceptable plans without human intervention remains a challenge in radiotherapy. While knowledge-based planning (KBP) predicts achievable dose-volume metrics, the resulting plans often fail to reach these metrics without manual adjustment. We introduce PlanningCopilot, a system that integrates a validated ESAPI optimization module ("PlanAct") with a multi-agent large language model (LLM) framework. This study evaluates the system's ability to autonomously generate clinically acceptable plans for locally advanced non-small cell lung cancer (LA-NSCLC) and assesses its potential to improve its performance through self-learning.
Methods
We developed PlanningCopilot, which comprises four specialized GPT-4.1 agents that iteratively drive the treatment planning system: (1) an Evaluator that parses dose distributions against clinical constraints, (2) a Supervisor that validates reasoning, (3) a Planner that executes tasks via the PlanAct API, which operationalizes human strategies for KBP-based initialization, organ-at-risk dose suppression, and hotspot management, and (4) an optional Learner that synthesizes optimization history into knowledge. We retrospectively analyzed 62 conventionally fractionated LA-NSCLC cases and compared the original clinical plans with autonomous plans generated with and without the Learner's knowledge.
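The control flow among the four agents can be illustrated with a minimal sketch. It is not the PlanningCopilot implementation: the agents are plain Python stubs rather than GPT-4.1 calls, the function names, the `PlanState` type, and the 10% dose reduction standing in for a PlanAct call are all hypothetical, and the limits in `CONSTRAINTS` are placeholder numbers, not the clinical constraints used in this study.

```python
from dataclasses import dataclass, field


@dataclass
class PlanState:
    """Toy stand-in for a plan: current dose metrics plus an action log."""
    metrics: dict
    history: list = field(default_factory=list)


# Placeholder limits for illustration only (NOT the study's clinical constraints).
CONSTRAINTS = {"Lung V20Gy (%)": 35.0, "Heart Dmean (Gy)": 20.0}


def evaluator(state):
    """Evaluator: list the metrics that currently exceed their limit."""
    return [name for name, limit in CONSTRAINTS.items()
            if state.metrics[name] > limit]


def supervisor(violations, proposal):
    """Supervisor: accept a proposal only if it targets a reported violation."""
    return proposal in violations


def planner(state, target):
    """Planner: hypothetical stand-in for a PlanAct call (toy 10% reduction)."""
    state.metrics[target] *= 0.9
    state.history.append(target)


def learner(history):
    """Learner: summarize which metrics needed intervention, and how often."""
    summary = {}
    for target in history:
        summary[target] = summary.get(target, 0) + 1
    return summary


def run(state, max_iterations=10):
    """Iterate evaluate -> validate -> act until all constraints are met.

    Returns the number of planner iterations that were needed.
    """
    for iteration in range(max_iterations):
        violations = evaluator(state)
        if not violations:
            return iteration
        proposal = violations[0]
        if supervisor(violations, proposal):
            planner(state, proposal)
    return max_iterations
```

Calling `run(PlanState({"Lung V20Gy (%)": 38.0, "Heart Dmean (Gy)": 18.0}))` loops until the Evaluator reports no violations, after which `learner(state.history)` yields a summary that a later run could consult. In the actual system, each of these roles is played by an LLM agent rather than a fixed rule.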
Results
All autonomous plans met every clinical dosimetric requirement, including requirements that the original clinical plans did not meet. Paired Wilcoxon signed-rank tests showed no significant differences between autonomous and clinical plans for Lung Dmean (p = 0.447), Lung V5Gy (p = 0.287), Lung V20Gy (p = 0.522), Heart Dmean (p = 0.637), Spinal Cord D0.03cc (p = 0.456), and Plan D0.03cc (p = 0.392). Autonomous plans achieved significantly lower Esophagus D0.03cc (p = 0.027). In a subset of 18 cases requiring at least two iterations, applying Learner-summarized knowledge reduced the number of required iterations by 11.8% on average while maintaining similar dosimetric quality (p > 0.05).
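For readers unfamiliar with the paired comparison used above, the following sketch shows how a two-sided Wilcoxon signed-rank test can be computed for one metric across paired plans. It is an illustration under stated assumptions, not the analysis code of this study: it uses the normal approximation with a tie correction and no continuity correction, whereas the study may have used an exact method or a statistics package, and the function name `wilcoxon_signed_rank` is hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided paired Wilcoxon signed-rank test (normal approximation).

    x, y: equal-length sequences of one metric for the two plan types.
    Returns (w_plus, w_minus, p_value).
    """
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, assigning average ranks to ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        group_size = j - i + 1
        tie_term += group_size ** 3 - group_size
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation to the null distribution of the rank sum.
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value
```

Passing, for example, the per-case Lung Dmean values of the autonomous plans as `x` and of the clinical plans as `y` would return the two rank sums and an approximate p-value. For small samples an exact test is generally preferable to the normal approximation used here.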
Conclusion
PlanningCopilot enables clinically acceptable autonomous planning for LA-NSCLC. The system consistently met clinical dosimetric requirements across varying anatomical complexities and improved its optimization efficiency by self-learning from its optimization history.