A Cancer Genomics Cloud Project for Radiological and Radiotherapy Image Analyses
Abstract
Purpose
This work presents a public project on the Cancer Genomics Cloud (CGC) platform that enables reproducible, out-of-the-box application of analysis workflows built on pre-trained radiological and radiotherapy AI models. It is the first and only such project to support analysis of radiotherapy images.
Methods
The main components of our software framework are the Python-native Computational Environment for Radiological Research (pyCERR), a platform for radiological image processing, and the CGC, an NCI-funded resource that lets users collaborate on cloud-hosted projects by providing tools for managing collaborators and hardware from providers such as AWS and GCP. CGC provides access to over 3 petabytes of publicly available data from the Cancer Research Data Commons (CRDC) ecosystem, along with data from repositories such as TCGA, TCIA and NHLBI BioData Catalyst. Analysis pipelines are distributed as Apps, each consisting of a Docker container with software dependencies, pre-configured hardware requirements, input/output types and inference scripts. Apps can be run interactively from a web browser or from a client computer using the REST API, and can be chained together to build analysis workflows. pyCERR is used for data transformation, radiomics and dosimetric computations, and visualization, and provides simplified access to metadata from various radiological image modalities for downstream analysis. AI inference from radiological images via this project can be readily combined with other data types, such as genomics and proteomics, from other CGC projects.
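As an illustration of running an App from a client computer through the REST API, the following minimal sketch builds (but does not send) the HTTP request that would create and start a task. It assumes the publicly documented CGC v2 endpoint and token header. The project ID, App ID, input port name and file ID are hypothetical placeholders, not identifiers from this project.

```python
import json
import urllib.request

# Assumed base URL of the CGC public API (v2).
CGC_API = "https://cgc-api.sbgenomics.com/v2"


def file_input(file_id):
    """Describe a platform file as a task input, referenced by its file ID."""
    return {"class": "File", "path": file_id}


def build_task_request(token, project, app, name, inputs):
    """Build a POST /tasks request that creates a task and runs it at once.

    Nothing is sent here; pass the result to urllib.request.urlopen to submit.
    """
    body = {"name": name, "project": project, "app": app, "inputs": inputs}
    return urllib.request.Request(
        url=f"{CGC_API}/tasks?action=run",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-SBG-Auth-Token": token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # All identifiers below are hypothetical placeholders.
    request = build_task_request(
        token="YOUR_AUTH_TOKEN",
        project="my-user/my-project",
        app="my-user/my-project/segmentation-app",
        name="Segmentation run",
        inputs={"input_scan": file_input("FILE_ID_OF_UPLOADED_SCAN")},
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Under the same assumptions, Apps could be chained by passing the ID of an output file from a completed task to `file_input` when building the request for the next App.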
Results
Our public project for AI analyses of radiological and radiotherapy images is available on the CGC platform. It includes Apps for head and neck (H&N) organ-at-risk (OAR) segmentation using 2D architectures and a foundational 3D SMIT model for segmenting thoracic and H&N OARs and lung nodules, trained in-house. We have also deployed clinically validated AI segmentation of over 70 organs using pyCERR utilities.
Conclusion
The presented public project on CGC simplifies the dissemination and use of medical image analysis workflows and their integration with multi-modal data.