Paper Proffered Program Therapy Physics

Multi-Tier Automated Pipeline for TG-263 Structure Name Standardization Using Rule-Based, NLP, and RAG-Enhanced LLM Matching

Abstract

Purpose

Big-data radiotherapy research depends on consistent structure nomenclature, which is difficult to achieve across large hospital networks. This study develops and validates an automated multi-tier pipeline for standardizing structure names. The pipeline integrates rule-based algorithms, natural language processing (NLP), and retrieval-augmented generation (RAG) with large language models (LLMs), and is evaluated on a dataset of 14,962 plans.

Methods

We analyzed 14,962 plans, encompassing 111,566 plan evaluations across 5,386 patients. Organs at risk (OARs) were delineated using TG-263-compliant AI auto-contouring tools. The dataset initially contained 7,710 unique structure names, which were normalized to 4,174 names. A ground-truth set of 106 structure names was established from 73 TG-263-aligned institutional dose-constraint protocols. To automate standardization, we developed a three-tier matching pipeline: Tier 1 (rule-based) targeted high-frequency patterns and dose-encoded strings via dictionary mapping; Tier 2 (NLP) handled minor typographic and character variations; and Tier 3 (LLM with RAG) used LLaMA 3.1 for semantic matching of the remaining names. Performance was benchmarked using usage-weighted match rates, which weight each structure name by how often it is used, so that high-frequency clinical structures dominate the metric and the result reflects real-world reliability and dosimetric relevance.
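The tiered fall-through described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the name lists, the alias dictionary, the dose pattern, the `PTV_5000`-style canonical form, and the 0.85 similarity cutoff are all hypothetical, `difflib` stands in for whatever NLP matcher the study used, and Tier 3 is represented only by a caller-supplied function where an LLM with RAG would sit.

```python
import difflib
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical miniature ground-truth list (the study used 106 names).
STANDARD_NAMES = ["SpinalCord", "Brainstem", "Parotid_L", "Parotid_R", "Heart"]

# Hypothetical Tier 1 dictionary of known high-frequency variants.
ALIAS_MAP = {"cord": "SpinalCord", "spinal cord": "SpinalCord", "lt parotid": "Parotid_L"}

# Hypothetical pattern for dose-encoded target names such as "ptv 5000".
DOSE_PATTERN = re.compile(r"^(ptv|ctv|gtv) ?(\d+)$")


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace, underscores, and hyphens to one space."""
    return re.sub(r"[\s_\-]+", " ", name.strip().lower())


# Lookup from normalized form back to the standard spelling.
_NORMALIZED_STANDARD = {normalize(n): n for n in STANDARD_NAMES}


def tier1_rule_based(raw: str) -> Optional[str]:
    """Exact, dictionary, and dose-pattern matches."""
    key = normalize(raw)
    if key in _NORMALIZED_STANDARD:
        return _NORMALIZED_STANDARD[key]
    if key in ALIAS_MAP:
        return ALIAS_MAP[key]
    m = DOSE_PATTERN.match(key)
    if m:
        # Assumed canonical form for illustration: TYPE_dose.
        return f"{m.group(1).upper()}_{m.group(2)}"
    return None


def tier2_fuzzy(raw: str, cutoff: float = 0.85) -> Optional[str]:
    """Catch minor typos with a string-similarity match (difflib as a stand-in)."""
    hits = difflib.get_close_matches(
        normalize(raw), list(_NORMALIZED_STANDARD), n=1, cutoff=cutoff
    )
    return _NORMALIZED_STANDARD[hits[0]] if hits else None


def match_structure(
    raw: str,
    semantic_matcher: Optional[Callable[[str, List[str]], Optional[str]]] = None,
) -> Tuple[Optional[str], str]:
    """Try each tier in order; return (standard name or None, tier label)."""
    std = tier1_rule_based(raw)
    if std is not None:
        return std, "tier1"
    std = tier2_fuzzy(raw)
    if std is not None:
        return std, "tier2"
    if semantic_matcher is not None:
        # Placeholder for Tier 3; accept only answers from the standard list.
        std = semantic_matcher(raw, STANDARD_NAMES)
        if std in STANDARD_NAMES:
            return std, "tier3"
    return None, "unmatched"
```

Ordering the tiers from cheapest to most expensive means the semantic matcher is consulted only for names the deterministic tiers cannot resolve. Checking its answer against the standard list guards against a generated name that is not in the ground truth.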

Results

The proposed pipeline achieved a 96.1% usage-weighted match rate, up from a 66.4% baseline. Usage pattern analysis revealed a highly skewed distribution: 127 active protocols (13% of observed protocols) generated 85.7% of the total clinical volume. This skew supports the Tier 1-centric design. Tier 1 captured 82.2% of the volume, Tier 2 captured 10.6%, and Tier 3 resolved 3.3%, leaving 3.9% unmatched. Although the number of unique structure names observed was 41-fold larger than the standardized set, the pipeline effectively managed these systematic deviations.
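To make the metric concrete, the sketch below computes a usage-weighted match rate and checks that the reported tier shares add up. The function name and the toy counts are assumptions for illustration; only the percentages in the final check come from the abstract.

```python
from typing import Dict, Set


def usage_weighted_match_rate(usage_counts: Dict[str, int], matched: Set[str]) -> float:
    """Fraction of total usage volume carried by names that were matched."""
    total = sum(usage_counts.values())
    if total == 0:
        raise ValueError("usage_counts must contain at least one use")
    matched_volume = sum(c for name, c in usage_counts.items() if name in matched)
    return matched_volume / total


# Toy data (hypothetical): one common name, one common variant, one rare leftover.
counts = {"SpinalCord": 900, "cord": 80, "zz_temp": 20}
matched_names = {"SpinalCord", "cord"}

weighted = usage_weighted_match_rate(counts, matched_names)  # 980 / 1000
unweighted = len(matched_names) / len(counts)                # 2 of 3 names

# Consistency check on the reported tier shares (percent of volume).
tier_shares = {"tier1": 82.2, "tier2": 10.6, "tier3": 3.3}
overall = round(sum(tier_shares.values()), 1)  # overall matched share
unmatched = round(100.0 - overall, 1)          # remaining share
```

In the toy data the weighted rate (0.98) is far higher than the unweighted rate (about 0.67) because the single unmatched name is rarely used. This is the same effect that lets a dictionary-driven first tier cover most of the volume when usage is highly skewed.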

Conclusion

Substantial deviations in nomenclature persist despite TG-263-compliant auto-contouring. The multi-tier pipeline systematically standardizes non-compliant structure names, neutralizing naming variation across clinics. This enables robust large-scale data aggregation and facilitates multi-institutional clinical trials by helping ensure that plan submissions from participating institutions adhere to standardized protocols.

People

David H. Thomas · Author · Thomas Jefferson University
Hamidreza Nourzadeh, PhD · Author · Thomas Jefferson University
Wookjin Choi, PhD · Presenting Author · Thomas Jefferson University
Yingxuan Chen, PhD · Author · Thomas Jefferson University
Yevgeniy Vinogradskiy, PhD · Author · Thomas Jefferson University
