Towards Registration-Free PET-CT Enhancement Via Resolution-Aware Latent Diffusion
Abstract
Purpose
While PET-CT imaging holds promise for simulation-free radiotherapy workflows, its inherently limited spatial resolution restricts its use for accurate tumor and organ-at-risk (OAR) contouring. This study aims to enhance the spatial resolution of PET-CT using a resolution-aware latent diffusion model guided by high-resolution, limited field-of-view (FOV) chest CT images, without requiring spatial alignment between the two modalities.
Methods
We propose a three-stage latent diffusion framework. First, a medical variational autoencoder (VAE) was trained to encode PET-CT and chest CT slices into a compact latent space. Second, a resolution-aware latent diffusion model was trained with textual prompts describing image resolution, so that it learned semantic representations of image resolution. Chest CT and synthetically degraded chest CT slices were used to learn high- and low-resolution priors, respectively, while PET-CT slices served as additional low-resolution examples. Third, a conditional diffusion model was trained to enhance PET-CT latent embeddings. To preserve anatomical consistency during enhancement, we incorporated structural guidance losses: a segmentation loss, the sum of squared vesselness measure difference (SSVMD), and a cycle consistency loss. Enhanced PET-CT images were then decoded using the pretrained autoencoder. Although chest CT and PET-CT were not spatially registered, the semantic priors learned in the second stage enabled registration-free enhancement. The model was trained on 150 patients and tested on 20 patients.
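As an illustration of how the second-stage training examples could be assembled, the sketch below pairs each slice with a resolution prompt. It is a minimal sketch under stated assumptions, not the implementation used in this study: the degradation operator (block averaging followed by nearest-neighbour upsampling), the prompt wording, and the names `degrade_slice`, `resolution_prompt`, and `build_stage2_examples` are all hypothetical.

```python
from statistics import fmean


def degrade_slice(image, factor=2):
    """Simulate a low-resolution slice (assumed degradation, for illustration).

    The slice is block-averaged by `factor`, then upsampled back to the
    original grid by nearest-neighbour repetition.
    """
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by factor")
    # Block-average downsampling: one value per factor x factor block.
    coarse = [
        [
            fmean(
                image[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)
            )
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]
    # Nearest-neighbour upsampling back to the original size.
    return [
        [coarse[r // factor][c // factor] for c in range(cols)]
        for r in range(rows)
    ]


def resolution_prompt(is_high_res):
    """Return a resolution prompt (hypothetical wording)."""
    if is_high_res:
        return "a high-resolution CT slice"
    return "a low-resolution CT slice"


def build_stage2_examples(chest_ct_slices, pet_ct_slices, factor=2):
    """Pair slices with prompts: chest CT as high-resolution examples,
    degraded chest CT and PET-CT as low-resolution examples."""
    examples = []
    for ct in chest_ct_slices:
        examples.append((ct, resolution_prompt(True)))
        examples.append((degrade_slice(ct, factor), resolution_prompt(False)))
    for pet_ct in pet_ct_slices:
        examples.append((pet_ct, resolution_prompt(False)))
    return examples
```

Because the prompts refer only to resolution and not to anatomy, no pairing or alignment between a chest CT slice and a PET-CT slice is needed to build these examples.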
Results
Enhanced PET-CT images demonstrated improved visual quality compared with raw PET-CT, with clearer anatomical boundaries, reduced noise, and better structural continuity in lung regions. These improvements suggest the proposed framework can enhance spatial resolution while preserving anatomical fidelity in a registration-free setting.
Conclusion
We present a registration-free PET-CT enhancement framework based on resolution-aware latent diffusion. By leveraging semantic resolution priors and structural guidance losses, the proposed method improves image resolution and anatomical consistency, offering a promising direction toward simulation-free radiotherapy planning.