Computationally Efficient Low Field MRI Denoising Via Foundation Model Adaptation
Abstract
Purpose
The clinical utility of low field MRI is limited by its inherently low signal-to-noise ratio (SNR). Effective feature modeling is vital to image denoising, yet modeling long-range feature dependencies is computationally expensive. This study investigates adapting a pretrained foundation model for computationally efficient low field MRI denoising.
Methods
T1- and T2-weighted images acquired on a low field (0.3 T) MRI scanner were investigated. High-SNR ground truth was obtained by signal averaging over repeated acquisitions. A pretrained transformer network from an image segmentation foundation model was used to extract global features from the low field images. An adaptor module was designed to process the transformer features and fuse them with local features extracted by a lightweight convolutional neural network (CNN) before decoding into high-SNR outputs. The proposed model was first trained on 128 T1-weighted MRI volumes. It was then fine-tuned on 32 T2-weighted MRI volumes to evaluate domain transfer learning performance. We compared the proposed model with both CNN- and transformer-based models, including U-Net, NAFNet and Restormer. Denoising performance was evaluated by calculating the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) between model outputs and ground truth. Computational complexity was quantified by the number of trainable parameters, peak GPU memory usage and inference time.
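As a rough illustration of two steps named above, the following minimal sketch averages repeated acquisitions into a higher-SNR reference and scores an image against a reference with PSNR. It is not the study's pipeline. The function names, the additive Gaussian noise model, the choice of 16 repeats, the noise level and the `data_range` of 1.0 are all assumptions made for the example; the abstract does not specify them.

```python
import math
import random


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length flat sequences.

    data_range is the assumed maximum possible intensity span (hypothetical: 1.0).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of samples")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(data_range ** 2 / mse)


def average_repeats(repeats):
    """Voxel-wise mean over repeated acquisitions of the same (flattened) image."""
    n = len(repeats)
    return [sum(voxels) / n for voxels in zip(*repeats)]


def simulate_repeats(clean, n_repeats, sigma, seed=0):
    """Toy acquisitions: clean signal plus independent Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in clean] for _ in range(n_repeats)]


# Synthetic 1-D "image" with intensities inside [0, 1].
clean = [0.5 + 0.4 * math.sin(i / 50.0) for i in range(4096)]
repeats = simulate_repeats(clean, n_repeats=16, sigma=0.05)

psnr_single = psnr(clean, repeats[0])
psnr_averaged = psnr(clean, average_repeats(repeats))
gain_db = psnr_averaged - psnr_single  # theory for independent noise: 10*log10(16) dB
```

Under the independent-noise assumption, averaging N repeats lowers the noise variance by a factor of N, so the expected PSNR gain is about 10·log10(N) dB. This is why an averaged acquisition can serve as a high-SNR reference, at the cost of N times the scan time.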
Results
The proposed model outperformed both CNN- and transformer-based models significantly (p<0.0001), achieving PSNR/SSIM values of 36.95 dB/0.929 and 35.91 dB/0.913 for T1- and T2-weighted MRI denoising, respectively. The computational complexity of the proposed model was comparable to CNN-based models and lower than transformer-based models, with 64.6% less GPU memory usage and 36.5% shorter inference time.
Conclusion
Foundation models pretrained for other medical imaging tasks can be adapted for low field MRI denoising. The adapted model exploits both long-range and local feature dependencies, improving image denoising quality without increasing model complexity.