CT-to-Functional Lung Imaging: Simultaneous Synthesis of Perfusion and Ventilation Images Using a Dual-Decoder Residual Attention Network
Abstract
Purpose
To develop a deep learning framework that simultaneously synthesizes lung perfusion and ventilation images from three-dimensional (3D) CT and to evaluate its potential clinical utility.
Methods
Ninety-eight cases with 3D CT and SPECT perfusion images (PI) and ventilation images (VI) were collected. CT and SPECT images were registered and cropped to include only the lungs. A dual-decoder residual attention network (DDRAN) was trained to jointly generate PI and VI from CT. For comparison, two conventional single-decoder residual attention networks (RAN) were trained separately, one for PI and one for VI. Voxel-wise agreement was assessed using the structural similarity index (SSIM) and Spearman’s rank correlation coefficient (Rs). Function-wise concordance was evaluated using the Dice similarity coefficient (DSC) in low- and high-functional regions. Differences between DDRAN and RAN were tested with the Wilcoxon signed-rank test. We also performed threshold-based classification and a two-part reader study (image acceptability; illustrative diagnosis from synthesized PI/VI pairs only).
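The evaluation metrics named above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of a lung mask, and the percentile cut-off that separates low- from high-functional regions are all assumptions, since the abstract does not state how the functional regions were defined. SSIM is omitted to keep the sketch dependent on NumPy and SciPy only.

```python
import numpy as np
from scipy.stats import spearmanr, wilcoxon


def spearman_rs(pred, ref, mask):
    """Voxel-wise Spearman rank correlation (Rs) inside a lung mask."""
    rs, _ = spearmanr(pred[mask], ref[mask])
    return float(rs)


def dice(a, b):
    """Dice similarity coefficient (DSC) between two binary volumes."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    # Two empty regions are treated as perfectly concordant.
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0


def functional_dsc(pred, ref, mask, percentile=66.0):
    """DSC in low- and high-functional regions.

    ASSUMPTION: each image is split at its own intensity percentile
    within the lung mask; the cut-off of 66 is a placeholder, not a
    value taken from the paper.
    """
    t_pred = np.percentile(pred[mask], percentile)
    t_ref = np.percentile(ref[mask], percentile)
    low = dice((pred < t_pred) & mask, (ref < t_ref) & mask)
    high = dice((pred >= t_pred) & mask, (ref >= t_ref) & mask)
    return low, high


def compare_models(scores_a, scores_b):
    """Paired Wilcoxon signed-rank test on per-case scores; returns the p-value."""
    _, p = wilcoxon(scores_a, scores_b)
    return float(p)
```

Under these assumptions, each test case would yield one Rs value and one low/high DSC pair per model, and `compare_models` would then be applied to the paired per-case scores of DDRAN and RAN.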
Results
Overall, DDRAN and RAN achieved comparable performance. The average SSIM values of the DDRAN/RAN models were 0.871/0.866 (p<0.05) for PI and 0.830/0.825 (p<0.05) for VI, and the corresponding Rs values were 0.836/0.819 for PI and 0.732/0.731 for VI. The DDRAN/RAN models achieved average DSC values of 0.795/0.797 for PI and 0.708/0.718 for VI in low-functional regions, and 0.857/0.849 for PI and 0.794/0.793 for VI in high-functional regions. In the two-part reader study, the synthesized perfusion and ventilation images received largely acceptable scores from readers at all experience levels and showed potential for use in diagnosis.
Conclusion
We propose a dual-decoder residual attention network that simultaneously synthesizes lung perfusion and ventilation images from 3D CT images. The preliminary results demonstrate moderate-to-high structural and functional concordance with SPECT, and the proposed model achieved accuracy comparable to that of single-decoder models. The synthesized perfusion and ventilation images could potentially be used for precise diagnosis and for guiding functional lung avoidance radiotherapy.