Poster Program · Therapy Physics

Multi-Scale Deep Learning with Multimodality Data for Postoperative Recurrence Prediction for Cervical Cancer Using MRI: A Multicenter Study

Abstract

Purpose

Cervical cancer (CC) remains one of the most common malignancies in women worldwide, and postoperative recurrence continues to challenge long-term survival. Because clinical decision-making relies on multimodal information, integrating imaging, textual, and clinical data has the potential to improve predictive performance. Developing a multimodal, multi-scale deep learning (DL) model may therefore enable more precise prognosis prediction in operable CC.

Methods

This multicenter retrospective study included 445 operable CC patients with preoperative MRI and corresponding radiology reports from three institutions. A multi-scale model (MSM) combining ConvNeXt and dual-path Vision Transformer (ViT) was constructed. Textual features were extracted from radiology reports using BERT and fused with imaging and clinical features to form a multimodal network (MSM-TC). The SHapley Additive exPlanations (SHAP) method and attention visualization were employed to enhance model interpretability.
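The abstract does not state how the imaging, textual, and clinical features were fused. As an illustration only, the sketch below assumes simple feature-level concatenation: each modality's feature vector is L2-normalized so that no single modality dominates by scale, and the three vectors are then joined into one input for a downstream classifier. The function names `l2_normalize` and `fuse_features` are hypothetical and are not taken from the study.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit Euclidean length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [float(x) for x in vec]
    return [x / norm for x in vec]


def fuse_features(
    image_feats: Sequence[float],
    text_feats: Sequence[float],
    clinical_feats: Sequence[float],
) -> List[float]:
    """Concatenate per-modality features after normalization.

    ASSUMPTION: concatenation is one common fusion strategy; the study's
    actual fusion mechanism is not described in the abstract.
    """
    return (
        l2_normalize(image_feats)
        + l2_normalize(text_feats)
        + l2_normalize(clinical_feats)
    )


if __name__ == "__main__":
    # Toy vectors standing in for imaging, report-text, and clinical features.
    fused = fuse_features([3.0, 4.0], [0.0, 0.0], [1.0])
    print(len(fused), fused)
```

In practice the imaging and text vectors would come from the trained image and language encoders, and the fused vector would feed a classification head; those components are outside the scope of this sketch.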

Results

Among single-branch models, ConvNeXt and ViT achieved the best predictive performance, with AUCs of 0.798 and 0.766, respectively, in the internal validation cohort and 0.656 and 0.704 in the external validation cohort. The MSM integrating ConvNeXt and ViT improved recurrence prediction, achieving AUCs of 0.944, 0.837, and 0.681 in the training, internal validation, and external validation cohorts, respectively. Incorporating textual information into MSM (MSM-T) raised both validation AUCs, with AUCs of 0.902, 0.860, and 0.742 across the same three cohorts, while the final multimodal model integrating imaging, textual, and clinical data (MSM-TC) achieved the best external validation performance, with AUCs of 0.930, 0.860, and 0.798. Kaplan–Meier analysis confirmed that the model-derived risk score stratified patients into high- and low-risk groups, supporting its prognostic value.
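The two evaluation tools named above, AUC and Kaplan–Meier analysis, can be sketched in a few lines of dependency-free Python. This is a minimal illustration on toy data, not the study's evaluation code. The helper names are hypothetical, and splitting patients at the median risk score is an assumption, since the abstract does not say which cutoff defined the high- and low-risk groups.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate: (time, survival probability) at each distinct event time.

    events[i] is 1 if recurrence was observed at times[i], 0 if the patient was censored.
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1.0 - observed / at_risk
        curve.append((t, survival))
    return curve


def split_by_median(scores: Sequence[float]) -> List[bool]:
    """Flag scores above the median as high risk. ASSUMPTION: median cutoff, not from the study."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    return [s > median for s in scores]


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))
    print(split_by_median(scores))
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

A full analysis would compute one Kaplan–Meier curve per risk group and compare them with a log-rank test; that comparison is omitted here to keep the sketch short.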

Conclusion

Our study demonstrates that a multimodal, multi-scale DL framework integrating imaging, textual, and clinical data can achieve robust prediction of recurrence and survival in operable CC patients, highlighting its potential for individualized prognostic assessment and future clinical translation.

People

Xiance Jin, PhD (Corresponding Author) · 1st Affiliated Hospital of Wenzhou Medical University
Yiyang Wu (Presenting Author) · 1st Affiliated Hospital of Wenzhou Medical University
Yao Ai (Author) · 1st Affiliated Hospital of Wenzhou Medical University
