Poster · Poster Program · Diagnostic and Interventional Radiology Physics

Synergistic Prediction of Neoadjuvant Immunotherapy Response in Esophageal Cancer via Multimodal Fusion of Orthogonal CT Latent Features and ViT-Based Histopathology

Abstract
Purpose

Predicting pathological complete response (pCR) following neoadjuvant immunotherapy (nICT) is critical for personalized management of esophageal cancer. This study develops an interpretable multimodal framework that integrates pre-treatment 3D CT latent features, extracted via a Forced-Orthogonal Autoencoder (FOAE), with whole-slide image (WSI) morphological features analyzed through Attention-based Multi-Instance Learning (AMIL).

Methods

We analyzed cohorts of 297 (pathology) and 285 (radiology) patients. For histopathology, WSIs were processed using a tile-based Vision Transformer (ViT-B/16) to extract high-level representations, which were aggregated by an AMIL model using dual-branch attention and dynamic Top-K sampling to generate pCR probabilities. For CT, a 3D convolutional FOAE was designed to learn compact latent representations of esophageal tumor volumes. To ensure feature disentanglement and minimize redundancy, an orthogonality constraint was enforced in the latent space using a cosine-sine reparameterization strategy. A hierarchical stretch operation isolated the 16 high-variance latent features most representative of tumor morphology. A final fusion model integrated these FOAE latent features with AMIL-derived scores, employing Layer-wise Relevance Propagation (LRP) to provide mechanistic interpretability by decomposing slide-level predictions into feature-specific relevance scores.
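The abstract does not spell out the cosine-sine reparameterization, so the following is only a minimal sketch of one common way such a constraint can be realized: composing plane (Givens) rotations whose entries are cos θ and sin θ of free angle parameters, which yields an orthogonal matrix by construction. The function names (`givens_orthogonal`, `is_orthogonal`, `rotate`) and the idea of applying the matrix to a latent vector are illustrative assumptions, not the authors' implementation.

```python
import math


def givens_orthogonal(angles, n):
    """Build an n x n orthogonal matrix from n*(n-1)/2 rotation angles.

    Assumption: one angle per coordinate pair (i, j) with i < j. Each angle
    defines a plane rotation with entries cos(theta) and sin(theta), so the
    product stays orthogonal for any angle values.
    """
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if len(angles) != len(pairs):
        raise ValueError(f"expected {len(pairs)} angles, got {len(angles)}")

    # Start from the identity and right-multiply by each plane rotation.
    q = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for theta, (i, j) in zip(angles, pairs):
        c, s = math.cos(theta), math.sin(theta)
        for row in q:
            a, b = row[i], row[j]
            row[i] = c * a - s * b
            row[j] = s * a + c * b
    return q


def is_orthogonal(q, tol=1e-9):
    """Check that the columns of q are orthonormal (Q^T Q = I within tol)."""
    n = len(q)
    for a in range(n):
        for b in range(n):
            dot = sum(q[r][a] * q[r][b] for r in range(n))
            expected = 1.0 if a == b else 0.0
            if abs(dot - expected) > tol:
                return False
    return True


def rotate(q, z):
    """Apply q to a latent vector z (hypothetical usage)."""
    return [sum(q[r][c] * z[c] for c in range(len(z))) for r in range(len(q))]
```

Because the angles are unconstrained real numbers, a parameterization of this kind can be trained with ordinary gradient descent and needs no separate orthogonality penalty. Whether the FOAE uses this construction or a different cosine-sine scheme is not stated in the abstract.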

Results

The AMIL model achieved stable pCR prediction (Validation F1: 0.732), with attention maps highlighting nuclear heterogeneity as a key predictor. The FOAE demonstrated high reconstruction fidelity (Validation MAE: 0.0027; Dice: 0.888), indicating that the learned latent space effectively encoded complex 3D tumor structures. The multimodal fusion model significantly outperformed single-modality predictors, achieving a training AUC of 0.890 and a validation AUC of 0.744. LRP analysis successfully quantified the synergistic contribution of specific orthogonal CT dimensions and histopathological probabilities to the final pCR prediction.
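For readers unfamiliar with the two reconstruction metrics quoted above, the sketch below shows their standard definitions on flattened inputs. It assumes Dice is computed on binarized volumes and MAE on normalized voxel intensities; the abstract does not state the thresholding or normalization used, and the function names are illustrative.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Two empty masks agree perfectly; treating this as 1.0 is a convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_absolute_error(recon, original):
    """Mean absolute voxel-wise difference between two flattened volumes."""
    if len(recon) != len(original) or not recon:
        raise ValueError("volumes must be non-empty and of equal length")
    return sum(abs(r - o) for r, o in zip(recon, original)) / len(recon)
```

A Dice value near 1 indicates close overlap between the reconstructed and original tumor volumes, and an MAE near 0 indicates small voxel-wise intensity error.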

Conclusion

Integrating orthogonal latent features from CT with AI-derived histopathological insights provides superior prognostic value for nICT response. This interpretable framework offers an objective, evidence-based tool for clinical decision-making, facilitating the identification of optimal candidates for surgery or organ-preservation strategies.
