Assessing the Reliability of Markerless Tumor Tracking Using Machine Learning Models
Abstract
Purpose
Markerless tumor tracking (MTT) using single-energy (SE) kilovoltage (kV) imaging is being considered for lung tumor motion management, but bone overlapping the tumor can reduce visibility and tracking accuracy. Dual-energy (DE) imaging suppresses bone and enhances soft-tissue contrast, potentially improving tumor tracking. Validating MTT in patient data is challenging because true tumor positions are unknown, so a way to assess MTT reliability is essential for clinical use. The goal of this study is to characterize several machine learning (ML) models for evaluating MTT reliability.
Methods
Alternating 60/120 kV images of a motion phantom were acquired with the on-board imager of a commercial linac using fast kV-switching. DE images were generated offline using weighted logarithmic subtraction. Tumor motion was estimated using a template-based tracking algorithm, with the programmed waveform serving as ground truth. Four ML models (logistic regression (LR), decision tree (DT), random forest (RF), and support vector classifier (SVC)) were trained separately for SE and DE images. Performance was assessed using receiver operating characteristic (ROC) curves, area under the curve (AUC), and sensitivity at a specificity of 0.95.
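Weighted logarithmic subtraction can be illustrated with a minimal sketch. It assumes the common form ln(I_high) - w * ln(I_low), with the weight w chosen to cancel bone; sign and weight conventions vary between implementations, and the study's actual weight is not stated here. The function name, the toy attenuation coefficients, and the thicknesses below are hypothetical values chosen only to make the cancellation visible.

```python
import math


def weighted_log_subtraction(high_kv, low_kv, weight):
    """Pixel-wise ln(I_high) - weight * ln(I_low) for two equally sized 2D images.

    Images are lists of rows holding positive transmitted intensities.
    """
    if len(high_kv) != len(low_kv):
        raise ValueError("images must have the same number of rows")
    result = []
    for row_high, row_low in zip(high_kv, low_kv):
        if len(row_high) != len(row_low):
            raise ValueError("rows must have the same length")
        result.append(
            [math.log(h) - weight * math.log(l) for h, l in zip(row_high, row_low)]
        )
    return result


# Hypothetical attenuation coefficients per unit thickness: (high kV, low kV).
MU = {"soft": (0.20, 0.25), "bone": (0.30, 0.60)}


def transmitted(t_soft, t_bone, energy_index):
    """Toy Beer-Lambert transmission through soft tissue and bone."""
    mu_soft = MU["soft"][energy_index]
    mu_bone = MU["bone"][energy_index]
    return math.exp(-(mu_soft * t_soft + mu_bone * t_bone))


# Two pixels with the same soft-tissue thickness; only the second lies behind bone.
thicknesses = [(4.0, 0.0), (4.0, 2.0)]
high = [[transmitted(ts, tb, 0) for ts, tb in thicknesses]]
low = [[transmitted(ts, tb, 1) for ts, tb in thicknesses]]

# Bone-cancelling weight in this toy model: ratio of bone attenuation at the two energies.
weight = MU["bone"][0] / MU["bone"][1]
de_image = weighted_log_subtraction(high, low, weight)
```

In this toy model both pixels end up with the same DE value, because the bone term cancels and only the soft-tissue contribution remains. Real images also contain scatter, noise, and beam hardening, so cancellation is only approximate in practice.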
Results
For SE images, RF achieved the highest performance (AUC = 0.951; sensitivity = 0.883 at 0.95 specificity), while LR showed the lowest AUC (0.875) and sensitivity (0.727); DT and SVC demonstrated intermediate results. For DE images, all models showed improved performance with AUCs ranging from 0.961 to 0.973, and sensitivities from 0.931 to 0.950. LR and RF achieved the highest sensitivities (0.950), while DT and SVC performed slightly lower (0.935 and 0.931, respectively). Overall, DE imaging consistently enhanced performance across models and reduced performance differences related to model complexity.
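Sensitivity at a fixed specificity, as reported above, is read from the ROC curve by choosing the operating point that meets the specificity target. The sketch below shows one way to compute it in plain Python. It is not the study's code: the function name, the label encoding (1 for the positive class, 0 for the negative class), and the toy scores are assumptions for illustration, and the abstract does not specify which tracking outcome was treated as the positive class.

```python
import math


def sensitivity_at_specificity(labels, scores, target_specificity=0.95):
    """Highest sensitivity among thresholds whose specificity meets the target.

    labels: 1 for the positive class, 0 for the negative class (assumed encoding).
    scores: classifier outputs; a sample is called positive when score >= threshold.
    """
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    positives = [s for y, s in zip(labels, scores) if y == 1]
    if not negatives or not positives:
        raise ValueError("both classes must be present")

    best = 0.0
    # Every distinct score is a candidate threshold; infinity rejects every sample.
    for threshold in sorted(set(scores)) + [math.inf]:
        true_negatives = sum(1 for s in negatives if s < threshold)
        specificity = true_negatives / len(negatives)
        if specificity >= target_specificity:
            true_positives = sum(1 for s in positives if s >= threshold)
            best = max(best, true_positives / len(positives))
    return best


# Toy data: 20 negative samples with low scores and 5 positive samples.
toy_labels = [0] * 20 + [1] * 5
toy_scores = [i / 100 for i in range(1, 21)] + [0.15, 0.5, 0.6, 0.7, 0.8]
toy_sensitivity = sensitivity_at_specificity(toy_labels, toy_scores, 0.95)
```

With these toy scores, holding specificity at 0.95 allows one false positive among the 20 negatives, and four of the five positives are detected. With a small test set the achievable specificities are coarse, so the selected operating point may exceed the target rather than match it exactly.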
Conclusion
In phantom experiments, the proposed ML models effectively evaluated MTT reliability. DE-ML models outperformed SE-ML models. Future work will focus on validation using patient data.