Deep Learning–Based Classification of Thyroid Nodules Using Bimodal Ultrasound Imaging
Abstract
Purpose
Thyroid nodules, which arise from abnormal cell growth within the thyroid gland, pose a significant diagnostic challenge, particularly in distinguishing benign from malignant nodules. Accurate differentiation is crucial to avoid unnecessary fine-needle aspiration biopsy (FNAB) and surgical resection, which, although effective, are invasive and associated with patient discomfort, anxiety, and financial burden. These limitations underscore the need for non-invasive diagnostic alternatives. To address this need, we propose a bimodal ultrasound-based deep learning framework designed to improve benign–malignant thyroid nodule classification.
Methods
This prospective study included 350 subjects. Bimodal ultrasound images, comprising B-mode and shear wave elastography (SWE), were acquired prior to biopsy and used for model development. The proposed network is a customized architecture built from depthwise separable convolutional layers and incorporates attention-based mixed pooling and a tailored self-attention mechanism. The two modalities were fused by input-level concatenation. The dataset was split 80%/10%/10% into training, validation, and test sets, and augmented with random cropping, rotation, and zooming. The network was trained with the Adam optimizer (learning rate 1×10⁻⁴) and binary cross-entropy loss.
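The 80%/10%/10% partition described above can be sketched as follows. This is a minimal illustration rather than the study's code. It assumes the split is made at the subject level, so that no subject contributes images to more than one partition; the abstract does not state this. The function name `split_subjects`, the seed, and the integer subject IDs are hypothetical.

```python
import random

def split_subjects(subject_ids, train_pct=80, val_pct=10, seed=0):
    """Shuffle subject IDs and partition them into train/val/test lists.

    Assumption: splitting is done per subject; the abstract does not say
    whether the split was per subject or per image.
    """
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    n_train = len(ids) * train_pct // 100
    n_val = len(ids) * val_pct // 100
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

train, val, test = split_subjects(range(350))
print(len(train), len(val), len(test))  # 280 35 35
```

Under this assumption, 350 subjects yield 280 for training and 35 each for validation and testing.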
Results
On the held-out test set, the proposed method achieved an accuracy of 0.95, an F1-score of 0.92, an area under the ROC curve (AUC) of 0.97, and a precision of 0.97. In comparisons with two existing networks, the proposed method performed better.
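For readers less familiar with these metrics, the sketch below shows how accuracy, precision, recall, and F1-score follow from confusion-matrix counts, and how recall can be recovered from precision and F1. The counts passed to `classification_metrics` are hypothetical and are not the study's data; both function names are illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    return f1 * precision / (2 * precision - f1)

# Hypothetical counts, chosen only to illustrate the formulas.
m = classification_metrics(tp=8, fp=2, tn=9, fn=1)
print(m["accuracy"], m["precision"], round(m["f1"], 3))  # 0.85 0.8 0.842

# Recall implied by the reported (rounded) precision and F1-score.
print(round(recall_from_precision_f1(0.97, 0.92), 3))  # 0.875
```

Taking the rounded values reported above at face value, a precision of 0.97 and an F1-score of 0.92 imply a recall (sensitivity) of roughly 0.87; the exact figure depends on the unrounded values.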
Conclusion
Integrating bimodal ultrasound imaging with a customized, attention-enhanced deep learning architecture improves thyroid nodule classification. This non-invasive approach may reduce unnecessary biopsies and downstream interventions while maintaining high diagnostic performance.