Ensemble Learning of Foundation Models for Precision Oncology
Abstract
Purpose
Histopathology is essential for disease diagnosis and treatment decision-making. Recent advances in artificial intelligence (AI) have enabled the development of pathology foundation models that learn rich visual representations from large-scale whole-slide images (WSIs). However, existing models are often trained on disparate datasets using varying strategies, leading to inconsistent performance and limited generalizability. Here, we introduce ELF (Ensemble Learning of Foundation models), a novel framework that integrates five state-of-the-art pathology foundation models to generate unified slide-level representations.
Methods
We trained ELF on 53,699 WSIs spanning 20 anatomical sites, leveraging ensemble learning to capture complementary information from diverse models while maintaining high data efficiency. Unlike traditional tile-level models, ELF produces slide-level representations, which is particularly advantageous in clinical contexts where data are limited, such as therapeutic response prediction. We evaluated ELF across a wide range of clinical applications, including classification and subtyping of 4,697 tumor samples, 88 combinations of biomarker-indication assessment in 11,530 patients, and therapeutic response prediction in 21 independent cohorts of 1,592 patients across 9 cancer types.
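The abstract does not specify how the five models' outputs are fused into a unified slide-level representation; the sketch below illustrates one simple ensemble scheme under stated assumptions (the model names, embedding dimensions, and mean-pooling aggregation are hypothetical, not the authors' actual architecture).

```python
import numpy as np

# Hypothetical embedding dimensions for five tile-level foundation models;
# the real models and dimensions are not specified in this abstract.
MODEL_DIMS = {"model_a": 768, "model_b": 1024, "model_c": 512,
              "model_d": 1280, "model_e": 768}

def ensemble_slide_embedding(tile_embeddings: dict) -> np.ndarray:
    """Fuse per-model tile embeddings into one slide-level vector.

    tile_embeddings maps model name -> array of shape (n_tiles, dim).
    Each model's tiles are mean-pooled to a slide vector, L2-normalized
    so no single model dominates the fused feature, then concatenated.
    """
    parts = []
    for name, dim in MODEL_DIMS.items():
        emb = tile_embeddings[name]                     # (n_tiles, dim)
        slide_vec = emb.mean(axis=0)                    # mean-pool over tiles
        slide_vec /= np.linalg.norm(slide_vec) + 1e-8   # L2-normalize
        parts.append(slide_vec)
    return np.concatenate(parts)                        # fused slide vector

# Example: random tile features for one slide (100 tiles per model)
rng = np.random.default_rng(0)
tiles = {name: rng.normal(size=(100, dim)) for name, dim in MODEL_DIMS.items()}
fused = ensemble_slide_embedding(tiles)
print(fused.shape)  # (4352,) = 768 + 1024 + 512 + 1280 + 768
```

A downstream classifier (e.g., for biomarker or response prediction) would then be trained on these fused slide vectors, which is what makes the approach data-efficient: only the lightweight head is fit on the limited clinical cohort.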
Results
ELF consistently outperformed individual foundation models as well as existing slide-level approaches across most of the clinical tasks. When evaluated on detecting 28 clinically actionable genetic alterations across 13 cancer types, ELF surpassed state-of-the-art foundation models, with AUC exceeding 0.85 for specific gene mutations such as BRAF in thyroid carcinoma and KRAS in lung adenocarcinoma. Compared with existing slide-level models, ELF improved the AUC by 9-14% for predicting anti-cancer therapy response and by 10-17% for predicting immunotherapy response across all evaluation datasets.
Conclusion
This study introduces a new paradigm for building foundation models, one that shifts the focus from constructing ever-larger models to strategically integrating the complementary strengths of existing pretrained models. Our ensemble learning-based approach enables the creation of robust and generalizable representations without requiring access to proprietary training data.