Using Raman Spectroscopy and Machine Learning to Predict Occurrence of Radiation Pneumonitis in Lung Cancer Patients
Abstract
Purpose
To predict the occurrence of clinically significant radiation pneumonitis in lung cancer patients using Raman spectroscopy on pre-treatment plasma samples.
Methods
Raman spectroscopy was performed on dried plasma from 142 patients: 70 who developed radiation pneumonitis (RP, grade ≥ 2 post-treatment) and 72 controls (RP grade = 0 post-treatment). Before analysis, spectra were pre-processed with an in-house algorithm that performed cosmic ray removal, spectral smoothing, background subtraction, and normalization. Group-and-basis restricted non-negative matrix factorization (GBR-NMF) was used to extract relative concentrations of plasma-specific biomarkers from each spectrum. The GBR-NMF concentrations for cholesterol and triglycerides were validated against lipid panel results acquired in parallel with Raman spectral collection. Multivariate statistical analyses were applied to identify spectral differences between the two outcome groups. Machine learning models, including logistic regression (LR) and linear discriminant analysis (LDA), were then used to assess the predictive performance of the GBR-NMF biomarkers and spectral regions of interest.
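The basis-restricted part of such a decomposition can be illustrated with a simplified sketch. It is not the GBR-NMF implementation used in the study: it holds every basis spectrum fixed, omits the group constraints, and uses the standard multiplicative NMF update to estimate non-negative scores. The function name `fit_fixed_basis`, the synthetic Gaussian peaks, and all numeric values are assumptions made for illustration.

```python
import numpy as np


def fit_fixed_basis(spectra, basis, n_iter=500, eps=1e-12):
    """Estimate non-negative scores H so that spectra ~= H @ basis.

    The basis spectra are held fixed (hypothetical reference spectra);
    only the scores are updated, using the multiplicative NMF rule.
    """
    spectra = np.asarray(spectra, dtype=float)
    basis = np.asarray(basis, dtype=float)
    scores = np.ones((spectra.shape[0], basis.shape[0]))
    numerator = spectra @ basis.T   # constant across iterations
    gram = basis @ basis.T          # overlap between basis spectra
    for _ in range(n_iter):
        # Multiplicative update keeps every score non-negative.
        scores *= numerator / (scores @ gram + eps)
    return scores


# Synthetic demonstration: three made-up "biomarker" spectra built from peaks.
shift = np.linspace(400, 1800, 300)  # wavenumber axis in cm^-1 (illustrative)


def peak(center, width=15.0):
    return np.exp(-0.5 * ((shift - center) / width) ** 2)


basis = np.vstack([
    peak(700) + 0.5 * peak(1440),
    peak(1080) + 0.3 * peak(1300),
    peak(1650),
])
true_scores = np.array([[1.0, 0.5, 0.2],
                        [0.3, 0.8, 0.9]])
spectra = true_scores @ basis        # noise-free mixtures of the basis spectra
scores = fit_fixed_basis(spectra, basis)
```

On this noise-free synthetic input, the recovered `scores` closely match `true_scores`. Real spectra would first need the pre-processing steps listed above.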
Results
A discriminative spectral region was found at 617–629.8 cm⁻¹ when comparing the spectra of the two outcome groups. An LR model with leave-one-patient-out cross-validation predicting RP occurrence yielded an AUC of 0.687 and a permutation p-value of 0.0003 (5000 permutations). Using GBR-NMF-derived biomarker concentrations, machine learning models yielded AUCs ranging from 0.58 to 0.66, depending on the combination of metabolites included in the model. Glucose, asparagine, and mannose consistently ranked among the top predictive features based on LR model coefficients. In each case, the GBR-NMF-derived cholesterol and triglyceride concentrations used in the predictive modelling were validated against lipid panel measurements, with the majority yielding strong correlations (Pearson’s r > 0.7).
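How an AUC and its permutation p-value are computed can be sketched with the Python standard library alone. This is an illustrative simplification, not the study's pipeline: it assumes one held-out score per patient is already available (for example from leave-one-patient-out cross-validation) and permutes labels against those fixed scores, whereas a fuller analysis would typically refit the model for every permutation. The function names and toy data are hypothetical.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_p_value(labels, scores, n_perm=5000, seed=0):
    """Fraction of label shufflings whose AUC is at least the observed AUC."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auc(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: 1 = RP, 0 = control; scores stand in for held-out predictions.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
toy_p = permutation_p_value(toy_labels, toy_scores, n_perm=2000)
```

With the add-one correction, the smallest p-value obtainable from 5000 permutations is 1/5001, which sets the resolution of any reported value.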
Conclusion
These preliminary findings suggest that Raman spectroscopy of pre-treatment plasma captures features that may be used to predict whether a patient will develop radiation pneumonitis following radiotherapy.