Paper · Proffered Program · Therapy Physics

LLM-Assisted FMEA: Automating Risk Analysis for Radiation Therapy

Abstract

Purpose

Failure Mode and Effects Analysis (FMEA) is the standard for proactive risk management in radiation therapy under AAPM TG-100 guidelines, yet traditional implementation can exceed 100 hours per process and shows significant inter-rater variability. This work presents a multi-agent LLM system that assists with FMEA generation while keeping clinicians in control, making TG-100 compliance practical for resource-limited institutions.

Methods

We developed a locally deployed LLM controller that coordinates seven specialized sub-modules for systematic FMEA. The system takes radiotherapy process descriptions as input and runs sub-modules for failure mode identification, root cause analysis, and scoring of occurrence (O), severity (S), and detectability (D) per TG-100 criteria. Each scorer uses domain-specific prompts that include the TG-100 scale definitions, expected score distributions from published FMEAs, and calibration guidance. A Validator agent checks output quality against a confidence threshold (≥0.7) and, when the output falls short, triggers up to three refinement attempts with specific feedback. Clinicians can add context, define custom failure modes, or override scores through the GUI. Local deployment via Ollama keeps patient data on-site.
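The validate-and-refine step described above can be sketched as a small control loop. This is a minimal illustration rather than the authors' implementation: only the 0.7 threshold and the limit of three refinement attempts come from the abstract, while the names `Validation`, `refine_until_valid`, `generate`, and `validate` are hypothetical stand-ins for the scorer and Validator agents.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.7  # threshold stated in the abstract
MAX_REFINEMENTS = 3         # "up to three refinement attempts"


@dataclass
class Validation:
    """Hypothetical validator verdict for one scorer output."""
    score: int          # an O, S, or D score on the TG-100 1-10 scale
    confidence: float   # validator confidence in [0, 1]
    feedback: str = ""  # specific guidance passed back to the scorer


def refine_until_valid(
    generate: Callable[[Optional[str]], int],
    validate: Callable[[int], Validation],
    threshold: float = CONFIDENCE_THRESHOLD,
    max_refinements: int = MAX_REFINEMENTS,
) -> Tuple[Validation, int]:
    """Run one initial attempt, then refine while confidence is too low.

    `generate` and `validate` are assumed callables wrapping the scorer
    and Validator agents; their signatures are not from the abstract.
    Returns the last verdict and the number of refinements used.
    """
    verdict = validate(generate(None))  # first attempt has no feedback
    refinements = 0
    while verdict.confidence < threshold and refinements < max_refinements:
        refinements += 1
        # Feed the validator's feedback into the next scoring attempt.
        verdict = validate(generate(verdict.feedback))
    return verdict, refinements
```

In this sketch, an output that never reaches the threshold is still returned after the third refinement, which leaves the final decision to the clinician reviewing it; how the actual system handles that case is not stated in the abstract.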

Results

While the system performs the complete FMEA workflow, our initial validation focused on the scoring components that make up the risk priority number (RPN = O × S × D). We compared LLM-generated O, S, and D scores against 30 failure modes from a published Gamma Knife radiosurgery FMEA. Bland-Altman analysis over three runs showed good agreement for occurrence (mean bias: +0.28, SD: 0.96) and moderate agreement for detectability (mean bias: +0.71, SD: 1.14). Severity scores showed a large positive bias (mean bias: +5.30, SD: 2.00): the system consistently overestimated potential harm, and the severity scorer requires further calibration.
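For readers unfamiliar with the agreement statistics, the sketch below shows how a Bland-Altman mean bias and SD are typically computed from paired scores. It is an illustration under assumptions: the function name `bland_altman` is hypothetical, the example scores are made up, the 95% limits of agreement (bias ± 1.96 SD) are the conventional addition and are not reported in the abstract, and the abstract does not say how the three runs were pooled.

```python
import statistics
from typing import Sequence, Tuple


def bland_altman(
    llm_scores: Sequence[float],
    reference_scores: Sequence[float],
) -> Tuple[float, float, Tuple[float, float]]:
    """Return (mean bias, SD of differences, 95% limits of agreement).

    Differences are taken as LLM minus reference, so a positive bias
    means the LLM scored higher than the published FMEA.
    """
    if len(llm_scores) != len(reference_scores):
        raise ValueError("score lists must be paired and of equal length")
    diffs = [l - r for l, r in zip(llm_scores, reference_scores)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits


# Made-up example scores for four failure modes (not data from the study).
example_llm = [4, 6, 5, 8]
example_ref = [3, 6, 4, 6]
bias, sd, limits = bland_altman(example_llm, example_ref)
```

A bias near zero with a small SD indicates close agreement with the reference scores, whereas a bias as large as the severity result (+5.30 on a 1-10 scale) points to a systematic offset rather than random scatter.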

Conclusion

This proof-of-concept shows that LLM-based FMEA automation is feasible for radiation oncology. The system supports expert judgment rather than replacing it. Future work will focus on multi-modality validation, addressing the severity scoring bias, and evaluating the system's failure mode identification and root cause analysis components.

People

Sadiki Daniel · Author · Louisiana State University
Nathan Dobranski · Presenting Author · Louisiana State University
Ara Alexandrian, PhD, MS · Author · Mary Bird Perkins Cancer Center
Garrett M. Pitcher, PhD · Author · Mary Bird Perkins Cancer Center
