Paper Proffered Program Therapy Physics

A Multimodal Foundation Model for Pediatric Multiparametric MRI Synthesis

Abstract
Purpose

Multiparametric brain MRI is essential for delineating pediatric brain tumor subregions; however, long acquisition times often preclude complete multi-contrast imaging in pediatric patients, and large public pediatric lesion datasets remain scarce. We propose a multimodal foundation-model framework that synthesizes missing MRI sequences for pediatric patients from a single input sequence, conditioned on acquisition-metadata text prompts and tumor segmentation maps.

Methods

We used a subset of the BraTS-PEDs dataset (112 subjects; 87/5/20 train/validation/test) containing T1-weighted (T1w), T2-weighted (T2w), and FLAIR images, tumor masks, and acquisition metadata (demographics, scanner and field strength, voxel size, and sequence parameters including TR/TE/TI/FA). Starting from a pretrained TUMSyn checkpoint, the model was fine-tuned for 100 epochs across all ordered modality pairs (six synthesis tasks) using balanced sampling, with the pretrained text encoder kept frozen. Acquisition metadata were encoded as text prompts, and tumor segmentation masks were concatenated as an additional input channel. Performance was evaluated using whole-volume and tumor-region PSNR and SSIM. The proposed model was compared with ablation models (without metadata or segmentation conditioning) and a zero-shot TUMSyn baseline using paired t-tests with Holm correction.
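As a rough illustration of two steps in the Methods, the sketch below enumerates the six ordered modality pairs and computes PSNR either over a whole volume or restricted to a tumor mask. It is not the authors' code: the function names `ordered_pairs` and `psnr`, the flat-list voxel representation, and the assumption that intensities are normalized to [0, 1] (`data_range=1.0`) are all illustrative choices that the abstract does not specify.

```python
import math
from itertools import permutations

MODALITIES = ["T1w", "T2w", "FLAIR"]

def ordered_pairs(modalities):
    """Return every (source, target) pair with source != target."""
    return list(permutations(modalities, 2))

def psnr(reference, synthesized, mask=None, data_range=1.0):
    """PSNR in dB over flat voxel lists.

    If `mask` is given, only voxels where the mask is truthy contribute,
    which yields a tumor-region PSNR. `data_range` is an assumed
    normalization, not a value taken from the abstract.
    """
    if mask is None:
        mask = [1] * len(reference)
    sq_errors = [(r - s) ** 2
                 for r, s, m in zip(reference, synthesized, mask) if m]
    if not sq_errors:
        raise ValueError("mask selects no voxels")
    mse = sum(sq_errors) / len(sq_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)

# Toy example with made-up voxel values (not study data).
tasks = ordered_pairs(MODALITIES)
reference = [0.0, 1.0]
synthesized = [0.1, 1.0]
whole_volume_db = psnr(reference, synthesized)
tumor_region_db = psnr(reference, synthesized, mask=[1, 0])
```

Because the masked version averages the squared error over fewer voxels, a synthesis error concentrated inside the tumor lowers tumor-region PSNR more than whole-volume PSNR, which is consistent with the tumor-region scores being lower than the whole-volume scores.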

Results

The proposed method achieved mean whole-volume PSNR/SSIM of 21.8 dB/0.900 and mean tumor-region PSNR/SSIM of 14.2 dB/0.839 across the six synthesis tasks. Removing acquisition-metadata conditioning reduced performance to 20.1 dB/0.885 (tumor region: 12.7 dB/0.821), and the zero-shot TUMSyn baseline performed worse still at 19.1 dB/0.865 (tumor PSNR: 11.9 dB). For the representative T1w→FLAIR task, fine-tuning with metadata achieved PSNR/SSIM of 22.7 dB/0.914 compared with 21.0 dB/0.875 without metadata (p = 0.007 / p < 0.01).
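The Methods state that comparisons used paired t-tests with Holm correction. The following is a minimal sketch of the Holm step-down adjustment only, assuming raw p-values are already available from the paired tests. The function name `holm_adjust` and the example p-values are hypothetical and are not taken from the study.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order.

    The k-th smallest raw p-value (k = 0, 1, ...) is multiplied by (m - k),
    capped at 1, and forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values for three comparisons (illustrative only).
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
```

A comparison is then declared significant at level alpha when its adjusted p-value is below alpha, which controls the family-wise error rate across the set of comparisons.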

Conclusion

This work presents a unified, multi-task multimodal foundation model for synthesizing missing pediatric brain tumor MRI sequences. Conditioning on acquisition and demographic metadata, together with segmentation context, improves tumor-region fidelity and may help preserve clinically relevant multiparametric information for tumor delineation and treatment planning when complete multi-sequence acquisitions are not feasible.
