Poster · Poster Program · Therapy Physics

Large-Scale Automatic Carbon Ion Treatment Planning for Head and Neck Cancers Via Parallel Multi-Agent Reinforcement Learning

Abstract

Purpose

Head-and-neck cancer (HNC) treatment planning is challenging because multiple critical organs-at-risk (OARs) lie close to complex target volumes. Intensity-modulated carbon-ion therapy (IMCT) is attractive for HNC because of its superior dose conformity and OAR sparing, but its planning process is slow owing to additional modeling requirements such as relative biological effectiveness (RBE). Recent studies have applied deep learning (DL) and reinforcement learning (RL) to automate treatment planning. DL-based methods often struggle with plan feasibility and optimality because of training data bias, while RL-based methods have difficulty efficiently exploring the treatment planning parameter (TPP) search space, which grows exponentially with the number of parameters.

Methods

We propose a scalable multi-agent reinforcement learning (MARL) framework that directly addresses these bottlenecks and enables parallel tuning of 45 TPPs for IMCT. We adopt a QMIX backbone with centralized training and decentralized execution (CTDE) to stabilize learning in a high-dimensional, non-stationary environment. To improve practicality, we (1) use compact historical dose-volume histogram (DVH) vectors as state inputs, (2) introduce a linear action-to-value transformation that maps small discrete actions to uniformly distributed parameter adjustments, and (3) design an absolute, clinically informed piecewise reward aligned with a comprehensive plan scoring system. To improve sample efficiency, a synchronous multi-process data-worker architecture interfaces with the treatment planning system (TPS) for parallel plan optimization and accelerated data collection.
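As a rough illustration of items (2) and (3), the sketch below shows one way a linear action-to-value mapping and an absolute piecewise reward could look. It is not the authors' implementation. The number of actions, the step size, the multiplicative form of the adjustment, the piecewise-linear shape of the score, and all function names are assumptions made for this example; only the count of 45 parameters comes from the abstract.

```python
import numpy as np

# Hypothetical settings; only N_AGENTS = 45 is taken from the abstract.
N_AGENTS = 45        # one agent per treatment planning parameter (TPP)
N_ACTIONS = 5        # assumed size of each agent's discrete action set
MAX_REL_STEP = 0.5   # assumed largest relative change per step


def action_to_multiplier(action, n_actions=N_ACTIONS, max_rel_step=MAX_REL_STEP):
    """Linearly map a discrete action index to a multiplicative factor.

    Index 0 maps to 1 - max_rel_step and the last index to 1 + max_rel_step,
    with uniform spacing in between (the middle index of an odd-sized
    action set leaves the parameter unchanged).
    """
    if n_actions < 2:
        raise ValueError("need at least two actions")
    if not 0 <= action < n_actions:
        raise ValueError("action index out of range")
    half = (n_actions - 1) / 2
    return 1.0 + (action - half) * (max_rel_step / half)


def apply_actions(params, actions):
    """Apply one discrete action per agent to its own parameter."""
    factors = np.array([action_to_multiplier(int(a)) for a in actions])
    return np.asarray(params, dtype=float) * factors


def piecewise_score(value, ideal, limit):
    """Score in [0, 1] for a lower-is-better dose metric (assumed shape).

    Full credit at or below `ideal`, no credit at or above `limit`,
    linear in between.
    """
    if value <= ideal:
        return 1.0
    if value >= limit:
        return 0.0
    return (limit - value) / (limit - ideal)


def absolute_reward(metrics, criteria, weights):
    """Weighted mean of per-metric scores for the current plan.

    "Absolute" here means the reward depends on the current plan's score
    itself, not on the change in score from the previous step.
    """
    scores = [piecewise_score(v, c[0], c[1]) for v, c in zip(metrics, criteria)]
    return float(np.average(scores, weights=weights))
```

For example, with these assumed settings `action_to_multiplier(0)` gives 0.5 and `action_to_multiplier(4)` gives 1.5, so each agent can at most halve or increase its parameter by half in one step. In a real system, the metric values would come from the DVH of the plan returned by the TPS after each optimization.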

Results

On a head-and-neck dataset (10 training, 10 testing), the method tuned 45 parameters simultaneously and yielded plans comparable to or better than expert manual plans (relative plan score: RL 85.93 ± 7.85% vs. manual 85.02 ± 6.92%), with statistically significant improvements (p < 0.05) for five OARs.

Conclusion

The results demonstrate that the framework can efficiently search the high-dimensional TPP space and produce clinically competitive plans through direct TPS interaction, particularly in terms of OAR sparing.

People

Jueye Zhang · Presenting Author · State Key Laboratory of Nuclear Physics and Technology, Institute of Heavy Ion Physics, Peking University School of Physics

Kai-Wen Li · Author · CAS Ion Medical Technology Co., Ltd.

Gen Yang · Author · State Key Laboratory of Nuclear Physics and Technology, Institute of Heavy Ion Physics, Peking University School of Physics

Chen Lin · Author · State Key Laboratory of Nuclear Physics and Technology, Institute of Heavy Ion Physics, Peking University School of Physics

Yibao Zhang · Corresponding Author · Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Radiation Oncology, Peking University Cancer Hospital & Institute

Chao Yang · Author · Department of Technology, CAS Ion Medical Technology Co., Ltd.

Youfang Lai · Author · Department of Technology, CAS Ion Medical Technology Co., Ltd.

Yunzhou Xia · Author · Department of Technology, CAS Ion Medical Technology Co., Ltd.

Haimei Zhang · Author · Department of Technology, CAS Ion Medical Technology Co., Ltd.

Jingjing Zhou · Author · Department of Technology, CAS Ion Medical Technology Co., Ltd.

Wenting Yan · Author · Department of Technology, CAS Ion Medical Technology Co., Ltd.