Language-Guided Agentic AI Framework for Image Registration in Prostate SBRT Using Model Context Protocol
Abstract
Purpose
To develop and validate an agentic AI framework, built on the Model Context Protocol (MCP), that lets clinicians control CT-MR image registration through natural language guidance. The study tests whether text-based clinical intent produces predictable, structure-specific alignment outcomes for prostate stereotactic body radiation therapy (SBRT) treatment planning.
Methods
An MCP server was developed to expose specialized registration tools that a large language model (LLM) agent can invoke dynamically. The framework parses the natural language instruction to identify the focus structure. Three prostate SBRT patients with existing prostate, bladder, and rectum contours on both CT and MR images were included. For each case, the agent calls MCP tools in sequence: a global Mutual Information (MI) alignment provides the baseline registration, and a weighted grid search over the available contours then optimizes the Dice Similarity Coefficient (DSC) for the focus structure. Each case was processed under four language-driven conditions: one with no specific guidance (baseline, MI only) and three instructing the agent to align specifically to the prostate, bladder, or rectum.
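The structure-specific refinement step can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes binary contour masks that have already been resampled onto a common grid with the baseline MI transform, restricts the search to integer-voxel translations, and uses a hypothetical weighting scheme (`weights_for_focus`, with an illustrative focus weight of 0.8) to emphasize the structure named in the instruction. All function names, the search radius, and the weights are assumptions for illustration.

```python
import itertools
import numpy as np

STRUCTURES = ("prostate", "bladder", "rectum")

def dice(a, b):
    """Dice Similarity Coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    total = a.sum() + b.sum()
    # Two empty masks are treated as perfectly overlapping by convention.
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0

def shift_mask(mask, offset):
    """Translate a mask by integer voxels, zero-filling (no wraparound)."""
    out = np.zeros_like(mask)
    src, dst = [], []
    for d, n in zip(offset, mask.shape):
        if abs(d) >= n:          # shifted entirely out of the volume
            return out
        if d >= 0:
            src.append(slice(0, n - d))
            dst.append(slice(d, n))
        else:
            src.append(slice(-d, n))
            dst.append(slice(0, n + d))
    out[tuple(dst)] = mask[tuple(src)]
    return out

def weights_for_focus(focus, focus_weight=0.8):
    """Hypothetical weighting: most weight on the focus structure,
    the remainder split evenly across the other structures."""
    if focus not in STRUCTURES:
        raise ValueError(f"unknown focus structure: {focus!r}")
    rest = (1.0 - focus_weight) / (len(STRUCTURES) - 1)
    return {s: (focus_weight if s == focus else rest) for s in STRUCTURES}

def refine_translation(fixed, moving, weights, radius=3):
    """Exhaustive grid search over integer translations within +/- radius
    voxels, maximizing the weighted sum of per-structure DSC values."""
    best_offset, best_score = (0, 0, 0), -1.0
    for offset in itertools.product(range(-radius, radius + 1), repeat=3):
        score = sum(
            w * dice(fixed[name], shift_mask(moving[name], offset))
            for name, w in weights.items()
        )
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score
```

In this sketch, a call such as `refine_translation(fixed, moving, weights_for_focus("bladder"))` returns the voxel offset, relative to the baseline alignment, that best overlaps the bladder contours while still giving the other structures some influence. Because a single rigid translation is shared by all structures, raising the weight of one structure generally lowers the overlap achieved for the others.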
Results
Each structure-focused alignment produced the highest DSC for its intended target structure in all patients. Target structure DSC improved by 5-11% over baseline MI registration. The largest improvement occurred for bladder-focused alignment, while improvements were smaller where the baseline registration already showed high overlap. Non-targeted structures showed the expected trade-offs in alignment, consistent with the constraints of rigid registration, in which a single transform is shared by all structures. Structure-specific refinement shifted the translation by 1-4 mm relative to baseline, confirming that different language instructions produce distinct registration transforms.
Conclusion
This work demonstrates a consistent relationship between natural language guidance and registration outcomes. The framework allows clinical teams to achieve anatomical alignment goals through conversational interaction rather than manual parameter adjustment. The MCP-based architecture provides a foundation for more complex autonomous workflows, in which similar natural language interfaces could be applied to radiotherapy treatment planning.