Poster Program · Therapy Physics

Quantifying Differences between Clinical Practices Using Natural Language Processing for AI Model Generalization In Radiation Oncology

Abstract
Purpose

Many artificial intelligence (AI) applications have recently been developed to improve the quality and efficiency of radiotherapy processes. Despite the success of image-based applications such as autocontouring, AI models trained on alphanumeric clinical data generalize poorly because clinical practices differ between institutions. In this multi-institutional study, we aim to quantify differences between clinical data from different radiation oncology (RO) practices using natural language processing, and to derive a metric that can indicate the potential usability of an AI model in a clinic without going through the complete validation process.

Methods

Anonymized RO datasets, including tumor locations, prescription, treatment planning, and setup parameters, were extracted from three US institutions and one European institution. Each clinical dataset was preprocessed locally into a tokenized corpus and converted into multidimensional vectors using a Word2Vec model with the continuous bag-of-words method. The generated word vectors were then shared, and the weighted cosine similarity (Csim) between the vectors was calculated to quantify differences in prescription, plan complexity, and treatment approaches for various anatomic tumor locations across institutions.
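As an illustration of the similarity step only, the following is a minimal sketch of a weighted cosine similarity between two vectors. The abstract does not state how the weights in Csim were defined, so the sketch assumes one non-negative weight per vector dimension; the function name `weighted_cosine_similarity` and the toy vectors are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def weighted_cosine_similarity(
    a: Sequence[float],
    b: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Cosine similarity with one non-negative weight per dimension.

    Assumption: the weighting scheme is per-dimension. The study's actual
    weighting is not described in the abstract.
    """
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("a, b and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    # Weighted inner product and weighted norms.
    dot = sum(w * x * y for w, x, y in zip(weights, a, b))
    norm_a = math.sqrt(sum(w * x * x for w, x in zip(weights, a)))
    norm_b = math.sqrt(sum(w * y * y for w, y in zip(weights, b)))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only (not study data).
vec_site_a = [1.0, 2.0, 3.0]
vec_site_b = [2.0, 4.0, 6.0]
uniform = [1.0, 1.0, 1.0]
csim = weighted_cosine_similarity(vec_site_a, vec_site_b, uniform)
```

With uniform weights this reduces to the ordinary cosine similarity, so parallel vectors score close to 1 and orthogonal vectors score 0. SciPy's `scipy.spatial.distance.cosine` also accepts a per-dimension weight argument `w`, but it returns a distance (one minus the similarity).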

Results

We computed Csim values for the prescription patterns and plan complexity of various tumor sites between institutions. For similar practices, Csim ranged from 0.7 to 0.9; for more divergent practices, it was lower. For example, for gastrointestinal prescription patterns, Csim between US institutions ranged from 0.85 to 0.87, while Csim between the European institution and the US institutions ranged from 0.23 to 0.30, indicating an observable difference between the datasets.

Conclusion

This study shows that Csim can capture differences between clinical datasets caused by differences in clinical practices. With the outlined vector space modeling framework, AI researchers can create suitable corpora for comparison and correlate Csim with model performance for validation and QA purposes.
