Poster Program · Therapy Physics

Development and Evaluation of a Specialized Large Language Model for Radiotherapy Knowledge Q&A

Abstract
Purpose

General-purpose large language models (LLMs) often exhibit limitations in specialized fields like radiotherapy, including insufficient expertise, outdated knowledge, and a tendency to generate hallucinations. To address these issues, this study developed a specialized Q&A system for radiotherapy by integrating a domain-specific knowledge base with Retrieval-Augmented Generation (RAG) and domain-adaptive fine-tuning of the DeepSeek model.

Methods

A structured knowledge base was constructed from AAPM reports, clinical guidelines (e.g., NCCN, ASTRO, CACA), textbooks, and peer-reviewed literature. Texts were cleaned, deduplicated, and embedded for vector-based retrieval. The system employed a RAG architecture: each user query triggered retrieval of relevant document snippets, which were combined with the query so that the DeepSeek model could generate an evidence-based answer. We compared a baseline general DeepSeek model against a domain-fine-tuned version optimized on a curated radiotherapy Q&A dataset. Five senior radiation oncologists and medical physicists rated the responses to 100 predefined questions covering dosimetry, contouring, quality control, and recent advances on a 5-point Likert scale (1: inaccurate, 5: highly accurate). ROUGE scores were calculated to assess text similarity with reference answers.
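The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the system built in this study: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the names `retrieve` and `build_prompt` are hypothetical, the snippets are placeholder text rather than quotations from any guideline, and the final call to the language model is omitted.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, snippets, k=2):
    """Return the k snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s["text"])), reverse=True)
    return ranked[:k]

def build_prompt(query, retrieved):
    """Combine the retrieved snippets with the query into one prompt string."""
    context = "\n".join(
        f"[{i + 1}] ({s['source']}) {s['text']}" for i, s in enumerate(retrieved)
    )
    return (
        "Answer using only the numbered context and cite the sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# Placeholder knowledge-base entries (hypothetical, for illustration only).
SNIPPETS = [
    {"source": "placeholder-doc-A",
     "text": "Daily output constancy checks compare linac output against a baseline."},
    {"source": "placeholder-doc-B",
     "text": "Contouring of organs at risk follows atlas definitions."},
    {"source": "placeholder-doc-C",
     "text": "Monthly quality control includes output constancy and laser alignment checks."},
]

query = "How is linac output constancy checked?"
top = retrieve(query, SNIPPETS, k=2)
prompt = build_prompt(query, top)  # this string would be sent to the language model
```

Keeping the source label next to each snippet in the prompt is one simple way to make the generated answer traceable to its evidence.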

Results

The specialized system achieved an average accuracy score of 4.5 ± 0.7, significantly outperforming the general DeepSeek model (3.6 ± 1.0, p < 0.01). For technical parameters (e.g., quality control standards), the specialized system scored 4.7, with 95% of answers citing authoritative sources. The ROUGE-L score of the specialized system (0.391) exceeded that of the general model (0.287), indicating closer alignment with the reference answers.
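For readers unfamiliar with ROUGE-L, the metric is based on the longest common subsequence (LCS) of tokens shared by a generated answer and a reference answer. The sketch below is illustrative only: it assumes whitespace tokenization, lowercasing, and the balanced F-measure, none of which the abstract specifies, and the function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l(candidate, reference):
    """ROUGE-L F-measure (assumed variant: equal weight on precision and recall)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)
```

Because the LCS only requires tokens to appear in the same order, not contiguously, ROUGE-L credits an answer that preserves the reference's sentence structure even when extra words are interleaved.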

Conclusion

The integration of a high-quality domain-specific knowledge base with RAG and domain-adaptive fine-tuning significantly enhances the accuracy and reliability of the DeepSeek model in radiotherapy. The developed specialized system effectively mitigates hallucinations, provides traceable and evidence-based answers, and shows strong potential for supporting clinical decision-making.
