Agentic Registration: When Natural Language Guidance Helps and When It Hurts
Abstract
Purpose
Current registration tools lack mechanisms for specifying clinical priorities. We developed a system accepting plain-text instructions (e.g., "prioritize tumor alignment for dose accumulation") and evaluated whether text guidance produces reliable, structure-specific outcomes.
Methods
An LLM interprets clinical text to generate binary masks from RTSTRUCT contours, then executes masked rigid registration. Testing used 4D-Lung data (3 patients, 0% to 50% respiratory phase). Five conditions were compared: global baseline, tumor-focused, spine-focused, heart-focused, and combined heart-spine. Prompt consistency was evaluated using eight phrasings per clinical goal. Failure-mode tests covered misspelled structure names, non-existent structures, and ambiguous requests. Dice coefficients were calculated by applying the resulting transforms to contour coordinates.
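The Dice evaluation described above can be illustrated with a minimal sketch. It assumes the rigid transform is available as a 3x3 rotation plus a translation, and it approximates each structure by the set of voxels its contour points fall in (no polygon filling). The function names `apply_rigid`, `voxelize`, and `dice` are hypothetical and are not taken from the system described here.

```python
def apply_rigid(points, rotation, translation):
    """Apply x' = R x + t to each (x, y, z) contour point."""
    moved = []
    for p in points:
        moved.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return moved


def voxelize(points, spacing=1.0):
    """Map points to the set of voxel indices they fall in.

    Assumption: a crude point-occupancy mask, not a filled contour.
    """
    return {tuple(int(round(c / spacing)) for c in p) for p in points}


def dice(voxels_a, voxels_b):
    """Dice coefficient 2|A and B| / (|A| + |B|) for two voxel sets."""
    if not voxels_a and not voxels_b:
        return 1.0  # convention chosen here: two empty structures agree
    return 2.0 * len(voxels_a & voxels_b) / (len(voxels_a) + len(voxels_b))


# Usage: shift a two-point "contour" by 1 mm along x and compare with the original.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
fixed_points = [(0, 0, 0), (1, 0, 0)]
moved_points = apply_rigid(fixed_points, identity, (1, 0, 0))
overlap = dice(voxelize(fixed_points), voxelize(moved_points))
```

A real evaluation would rasterize the closed RTSTRUCT contours slice by slice before computing overlap; the point-occupancy shortcut here only keeps the example self-contained.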
Results
Text guidance produced measurably different transforms: relative to the global baseline, tumor focus shifted the result by 12.5 mm, spine focus by 4.1 mm, and heart focus by 8.2 mm. Structure-focused registration improved target alignment for rigid anatomy (cord Dice +7.5%, tumor +2.5%) but degraded deformable structures (heart Dice -11%). Prompt consistency testing showed identical transforms across all semantic variants (variance <0.001 mm). Invalid inputs (misspellings, missing structures) defaulted to global registration without user notification, producing 2 mm discrepancies from intended behavior.
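One way to avoid the silent fallback reported above is to validate the requested structure name against the RTSTRUCT names before registration and to surface any substitution or failure to the user. The sketch below is an illustration under assumptions, not the system's implementation: `resolve_structure`, `StructureNotFound`, and the 0.8 similarity cutoff are all hypothetical choices, and the fuzzy matching uses Python's standard `difflib`.

```python
import difflib


class StructureNotFound(ValueError):
    """Raised when a requested structure cannot be matched to any contour."""


def resolve_structure(requested, available, cutoff=0.8):
    """Return (matched_name, warning) or raise StructureNotFound.

    Exact matches (case-insensitive) return no warning. Near misses such as
    typos return the closest name plus a warning to show the user. Anything
    else raises instead of falling back to global registration.
    """
    lookup = {name.lower(): name for name in available}
    key = requested.strip().lower()
    if key in lookup:
        return lookup[key], None
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    if close:
        match = lookup[close[0]]
        return match, f"'{requested}' not found; using closest match '{match}'"
    raise StructureNotFound(
        f"No structure matching '{requested}'; available: {sorted(available)}"
    )


# Usage with hypothetical structure names.
names = ["Heart", "SpinalCord", "GTV"]
exact = resolve_structure("heart", names)
fuzzy_name, fuzzy_warning = resolve_structure("hart", names)
try:
    resolve_structure("liver", names)
    missing_raised = False
except StructureNotFound:
    missing_raised = True
```

Returning the warning alongside the match, rather than logging it, leaves the caller responsible for showing it, so a typo can never change the registration target without the user being told.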
Conclusion
Text-guided registration produces consistent, structure-specific alignment. Benefits depend on anatomy type: rigid structures improve while deformable organs may be harmed by focused rigid registration. Silent failure on invalid input requires correction before clinical deployment.