Large Cohort Selection of Medical Imaging Exams from MIDRC Using the LOINC/RSNA Radiology Playbook
Abstract
Purpose
Developing artificial intelligence (AI) methods with improved repeatability and reproducibility requires large, publicly available, curated imaging data sets. Notable public repositories of DICOM images exist, including the Medical Imaging and Data Resource Center (MIDRC), a partnership of the AAPM, RSNA, and ACR. MIDRC currently has over 500,000 studies (exams) in process, of which over 200,000 have been processed and made publicly available. A challenge for users is selecting appropriate cohorts from the highly variable, free-text Study Descriptions that the contributing imaging centers supply in the DICOM metadata. The purpose of this effort was to map those free-text Study Descriptions to a standardized, controlled vocabulary to facilitate efficient cohort creation.
Methods
We used Logical Observation Identifiers Names and Codes (LOINC), which incorporates the RSNA RadLex Playbook, as our controlled vocabulary. Each LOINC code has a unique, algorithmically generated LOINC Long Common Name. For each Modality and Study Description pair that had 10 or more occurrences, a Long Common Name was assigned in a 'MIDRC-LOINC Mapping Table'. Filtering attributes based on modality, body region, and contrast agent were also generated. We applied this mapping to all of the publicly available exams in the MIDRC database.
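The mapping workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MIDRC implementation: the record field names (`modality`, `study_description`), the `normalize` helper, and the example mapping entry are all hypothetical, and the abstract does not say whether Study Descriptions were normalized before counting. Only the threshold of 10 occurrences comes from the text.

```python
from collections import Counter

MIN_OCCURRENCES = 10  # threshold stated in the Methods


def normalize(text):
    """Assumed helper: collapse case and whitespace before matching."""
    return " ".join(text.upper().split())


def pair_key(exam):
    """Build the (Modality, Study Description) key for one exam record."""
    return (exam["modality"], normalize(exam["study_description"]))


def frequent_pairs(exams, min_count=MIN_OCCURRENCES):
    """Return the pairs that occur at least min_count times, with counts."""
    counts = Counter(pair_key(e) for e in exams)
    return {pair: n for pair, n in counts.items() if n >= min_count}


def apply_mapping(exams, mapping_table):
    """Attach a Long Common Name to each exam (None when the pair is unmapped)."""
    return [
        {**e, "loinc_long_common_name": mapping_table.get(pair_key(e))}
        for e in exams
    ]


def coverage(mapped_exams):
    """Fraction of exams that received a Long Common Name."""
    if not mapped_exams:
        return 0.0
    hits = sum(e["loinc_long_common_name"] is not None for e in mapped_exams)
    return hits / len(mapped_exams)


# Toy data: one frequent pair (12 exams) and one rare pair (3 exams).
exams = (
    [{"modality": "CT", "study_description": "ct chest  wo contrast"}] * 12
    + [{"modality": "DX", "study_description": "Chest 1 view"}] * 3
)

candidates = frequent_pairs(exams)

# Hypothetical mapping entry; the name string is illustrative only and
# has not been checked against the LOINC release.
mapping_table = {("CT", "CT CHEST WO CONTRAST"): "CT Chest WO contrast"}

mapped = apply_mapping(exams, mapping_table)
print(f"{coverage(mapped):.0%} of exams mapped")
```

In this sketch, only pairs that pass the frequency threshold would be reviewed and entered into the mapping table by hand; rare pairs remain unmapped, which is why coverage stays below 100%.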
Results
For the 233,399 exams that are currently publicly available, we matched over 98% of the Study Descriptions using 597 unique LOINC Long Common Names. The MIDRC Data Explorer page allows filtering by LOINC Long Common Name to build cohorts for download. The MIDRC-LOINC Mapping Table and the filtering attributes are publicly available on GitHub.
Conclusion
The MIDRC-LOINC Mapping Table, together with the filtering attributes, allows fast and accurate selection of image cohorts from the MIDRC collection. This process can be applied to other image repositories.