Image-to-Drug in Glioblastoma: Multi-Sequence MRI Radiomics Coupled with Deep Q-Network Drug Discovery
Abstract
Purpose
To develop an image-to-drug framework for glioblastoma (GBM) that translates multi-sequence MRI radiomics–based survival risk into actionable therapeutic hypotheses, by integrating ensemble survival modeling with multi-omics reinforcement learning.
Methods
Radiomic features were extracted from preoperative multi-sequence MRI (T1, contrast-enhanced T1, T2, and FLAIR). For each MRI sequence, a survival model was independently optimized using machine-learning–based feature selection, including LASSO coefficients and tree-based importance (XGBoost, Random Forest, and CatBoost). Sequence-specific radiomic risk scores were integrated into a Multimodal Ensemble Survival Model (MESM), which combines radiomic risks with clinical variables in a single Cox proportional hazards framework. To translate prognostic imaging information into therapeutic hypotheses, radiomic representations were combined with transcriptomic profiles to construct multi-omics patient state vectors. A Deep Q-Network (DQN) reinforcement learning agent was trained to prioritize candidate drugs derived from L1000 connectivity analysis. The action space consisted of top-ranked compounds, and the reward function explicitly modeled an efficacy–toxicity trade-off, defined as an expression-signature match score penalized by side-effect severity.
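The efficacy–toxicity reward described above can be illustrated with a minimal sketch. The abstract does not give the functional form, so the linear penalty, the `penalty_weight` parameter, the `Candidate` fields, and the placeholder compound names below are all assumptions made for illustration, not the study's implementation or data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    """One action in the (hypothetical) drug action space."""
    name: str
    match_score: float  # assumed: expression-signature match, higher is better
    severity: float     # assumed: side-effect severity, higher is worse


def reward(candidate: Candidate, penalty_weight: float = 0.5) -> float:
    """Match score penalized by side-effect severity.

    The linear form and the default weight are assumptions; the source only
    states that the match score is penalized by severity.
    """
    return candidate.match_score - penalty_weight * candidate.severity


def rank(candidates: List[Candidate], penalty_weight: float = 0.5) -> List[Candidate]:
    """Order candidates from highest to lowest reward."""
    return sorted(candidates, key=lambda c: reward(c, penalty_weight), reverse=True)


if __name__ == "__main__":
    # Placeholder compounds with made-up scores, purely to exercise the code.
    pool = [
        Candidate("compound_A", match_score=0.9, severity=1.0),
        Candidate("compound_B", match_score=0.7, severity=0.2),
        Candidate("compound_C", match_score=0.5, severity=0.0),
    ]
    for c in rank(pool):
        print(f"{c.name}: reward={reward(c):.2f}")
```

In a DQN setting, a scalar of this kind would be the signal the agent receives after selecting a compound for a given patient state vector; raising `penalty_weight` shifts the ranking toward better-tolerated compounds.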
Results
MESM achieved a C-index of 0.72 with significant survival stratification (log-rank p = 0.03), outperforming the combined LASSO-Cox model (C-index 0.61; log-rank p = 0.40) and single-model baselines (C-index: XGBoost 0.65; CatBoost 0.60; Random Forest 0.60); these performance trends were preserved in external inference. In reinforcement learning–based drug prioritization, the BCL-2/BCL-XL inhibitor navitoclax ranked highest, ahead of the standard-of-care agent temozolomide, while metformin, atovaquone, and atorvastatin were also prioritized favorably as repurposing candidates.
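For readers less familiar with the concordance index (C-index) reported above, the following is a minimal sketch of Harrell's C-index for right-censored survival data. It is a generic textbook-style implementation, not the code used in the study; the function name and argument layout are assumptions, and pairs with tied survival times are simply skipped for brevity.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter time than subject j. The pair is concordant when the
    model assigns subject i the higher risk; tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


if __name__ == "__main__":
    # Toy values only: higher risk paired with shorter survival throughout.
    print(concordance_index([1, 2, 3, 4], [True, True, True, True], [4, 3, 2, 1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the values 0.60 to 0.72 above should be read.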
Conclusion
This study introduces a novel image-to-drug paradigm for GBM by coupling multi-sequence MRI radiomics ensemble survival modeling with multi-omics reinforcement learning–based drug prioritization, enabling radiomics-driven, risk-guided therapeutic discovery.