Paper Proffered Program Diagnostic and Interventional Radiology Physics

Automated Vision-Language Model-Derived Clinical Descriptors Enhance Radiomic Profiling for Robust Breast Malignancy Prediction

Abstract
Purpose

To enhance breast malignancy prediction, this study develops a multimodal framework that integrates automated, Vision-Language Model (VLM)-derived BI-RADS lexicons with quantitative radiomic features.

Methods

This multi-center study included 889 patients from two institutions, partitioned into training (80%) and independent testing (20%) cohorts. A VLM-driven workflow using Gemini 3 Pro was developed to simulate expert-level observation. In place of traditional manual annotation, the VLM analyzed standard dual-view (CC/MLO) mammograms according to the BI-RADS 5th Edition guidelines and automatically generated qualitative descriptors covering calcification morphology (e.g., fine pleomorphic, amorphous), distribution patterns (e.g., linear, segmental), and architectural distortion. These "digitized clinical observations" were integrated with quantitative radiomic features (shape, first-order, and texture matrices) through a multimodal early-fusion strategy. Following LASSO-based feature selection, an ensemble of ten machine learning classifiers (including Random Forest, XGBoost, and SVM) was trained. Performance was quantified in the test cohort via AUC with 95% confidence intervals (CI) estimated from 1,000 bootstrap resampling iterations.
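As an illustration of the early-fusion step, the sketch below one-hot encodes categorical descriptors and concatenates them with a radiomic feature vector before any feature selection or classification. It is a minimal sketch under stated assumptions, not the study's pipeline: the `LEXICON` categories, the field names, and the functions `one_hot_descriptors` and `early_fusion` are hypothetical, and the abstract does not specify how the VLM output was encoded.

```python
from typing import Dict, List

# Hypothetical descriptor vocabularies. The abstract lists only examples,
# so these category sets are illustrative, not the study's actual lexicon.
LEXICON: Dict[str, List[str]] = {
    "calc_morphology": ["none", "amorphous", "fine pleomorphic"],
    "calc_distribution": ["none", "linear", "segmental"],
    "architectural_distortion": ["absent", "present"],
}


def one_hot_descriptors(descriptors: Dict[str, str]) -> List[float]:
    """Encode each categorical descriptor as a one-hot segment.

    A missing or unrecognized value yields an all-zero segment for that field.
    """
    vec: List[float] = []
    for field, categories in LEXICON.items():
        value = descriptors.get(field)
        vec.extend(1.0 if value == c else 0.0 for c in categories)
    return vec


def early_fusion(radiomics: List[float], descriptors: Dict[str, str]) -> List[float]:
    """Early fusion: concatenate both modalities into one feature vector."""
    return list(radiomics) + one_hot_descriptors(descriptors)


# Toy case: three made-up radiomic values plus one set of descriptors.
fused = early_fusion(
    [0.82, 13.5, 0.07],
    {
        "calc_morphology": "fine pleomorphic",
        "calc_distribution": "segmental",
        "architectural_distortion": "absent",
    },
)
print(len(fused))  # 3 radiomic values + 8 one-hot slots = 11
```

In an early-fusion design, the fused vector is what feature selection and the classifiers see, so both modalities compete for selection on equal footing.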

Results

The VLM-augmented fusion framework demonstrated superior robustness and accuracy compared to unimodal baselines. The Random Forest classifier achieved the highest performance, with an AUC of 0.865 (95% CI: 0.802–0.920), significantly outperforming both the radiomics-only model (AUC 0.847) and the lexicon-only model (AUC 0.758). This trend was consistent across other ensemble architectures such as XGBoost and LightGBM (AUCs > 0.84). Integrating the VLM-derived lexicons also raised the lower bound of the 95% CI from 0.777 (radiomics-only) to 0.802 (fusion).
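Confidence intervals of this kind can be obtained by resampling the test cohort with replacement and recomputing the AUC each time. The following is a minimal sketch of a percentile bootstrap, assuming binary labels (1 = malignant) and continuous model scores. The function names, the percentile method, and the toy data are assumptions for illustration; the abstract states only that 1,000 bootstrap iterations were used.

```python
import random
from typing import List, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point AUC, lower bound, upper bound) using a percentile bootstrap."""
    point = auc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample cases with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUC is undefined; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


# Toy data, not study data.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point, lower, upper = bootstrap_auc_ci(toy_labels, toy_scores)
print(point)  # 0.75: three of the four positive-negative pairs are ranked correctly
```

The pairwise AUC here is quadratic in cohort size, which is acceptable for a test cohort of a few hundred cases; a rank-based formulation would scale better.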

Conclusion

This study validates the novel application of VLMs as automated clinical observers in medical imaging. By effectively fusing VLM-derived semantic logic with micro-structural radiomics, the proposed pipeline offers an accurate decision-support tool for multimodal breast cancer diagnosis.
