Augmenting Patient Safety Surveillance in Radiation Oncology with Large Language Model-Based Root Cause Analysis
Abstract
Purpose
To evaluate the reasoning capabilities of large language models (LLMs) in performing root cause analysis (RCA) of radiation oncology incidents using narrative reports from the Radiation Oncology Incident Learning System (RO-ILS), and to assess their potential utility in supporting patient safety efforts.
Methods
We prompted four state-of-the-art LLMs (Gemini 2.5 Pro, GPT-4o, o3, and Grok 3) with the “Background and Incident Overview” sections from 19 publicly available RO-ILS cases. Each model was instructed to perform RCA and generate root causes, lessons learned, and suggested actions using a standardized prompt based on AAPM RCA guidelines. Model outputs were evaluated using objective semantic similarity metrics (cosine similarity computed from Sentence Transformer embeddings), semi-subjective assessments (precision, recall, F1-score, accuracy, hallucination rate, and performance criteria covering relevance, comprehensiveness, quality of justification, and quality of solution), and subjective ratings (reasoning quality and overall performance) by five board-certified medical physicists.
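As an illustration of the objective and semi-subjective metrics named above, the following minimal sketch computes cosine similarity between two embedding vectors and precision, recall, and F1-score from match counts. It is not the study's evaluation code: the function names, the toy vectors, and the counts are hypothetical, and it assumes the embeddings have already been produced by a sentence-embedding model and that reviewers have already labeled each generated root cause as a true positive, false positive, or false negative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for a zero vector (assumption)
    return dot / (norm_u * norm_v)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from counts of labeled root causes.

    tp: generated causes that match an expert-identified cause
    fp: generated causes with no expert counterpart
    fn: expert-identified causes the model missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical values for illustration only (not data from the study).
similarity = cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.5])
precision, recall, f1 = precision_recall_f1(tp=4, fp=1, fn=1)
```

Accuracy and hallucination rate are omitted from the sketch because their exact definitions are not stated in this abstract.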
Results
LLMs demonstrated satisfactory performance across evaluation metrics. GPT-4o achieved the highest cosine similarity (0.831), and Gemini 2.5 Pro had the highest recall (0.799) and accuracy (0.918). All models exhibited some degree of hallucination, with rates ranging from 11% to 61%. Gemini 2.5 Pro, which outperformed all other models across the performance evaluation criteria, received an overall performance rating of 4.8 out of 5 from expert reviewers. Statistically significant differences among models were observed in accuracy, hallucination rate, and subjective ratings (p < 0.05).
Conclusion
LLMs delivered promising results as assistive tools for RCA in radiation oncology, generating relevant and accurate analyses aligned with expert expectations. LLMs may support incident analysis and contribute to quality improvement efforts that advance patient safety in clinical radiation oncology practice.