We introduce a multi-institutional data harvesting (MIDH) method for longitudinal observation of medical imaging utilization and reporting. By tracking both large-scale utilization data and clinical imaging results, the MIDH approach aims to measure surrogates for important disease-related observational quantities over time. To quantitatively investigate its clinical applicability, we performed a retrospective multi-institutional study encompassing 13 healthcare systems throughout the United States before and after the onset of the 2020 COVID-19 pandemic. Using repurposed software infrastructure of a commercial AI-based image analysis service, we harvested data on medical imaging service requests and radiology reports for 40,037 computed tomography pulmonary angiograms (CTPA) performed to evaluate for pulmonary embolism (PE). Specifically, we compared two 70-day observational periods, namely (i) a pre-pandemic control period from 11/25/2019 through 2/2/2020, and (ii) a period during the early COVID-19 pandemic from 3/8/2020 through 5/16/2020. Natural language processing (NLP) on final radiology reports served as the ground truth for identifying positive PE cases. Based on a manual review of 2,400 radiology reports, the NLP classified reports as positive or negative for PE with an accuracy of 98%. Fewer CTPA exams were performed during the early COVID-19 pandemic than during the pre-pandemic period (9,806 vs. 12,106). However, the PE positivity rate was significantly higher (11.6% vs. 9.9%, p < 10⁻⁴), with an excess of 92 PE cases during the early COVID-19 outbreak, i.e., ~1.3 more PE cases per day than statistically expected. Our results suggest that MIDH can contribute value as an exploratory tool for better understanding pandemic-related effects on healthcare.
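As a quick check of the arithmetic stated above, the following minimal Python sketch confirms that both observational periods span 70 days when the end date is counted, and that 92 excess PE cases over such a period correspond to roughly 1.3 cases per day. The helper name `inclusive_days` and the variable names are illustrative assumptions and not part of the study's software. The sketch does not reproduce the study's statistical test or its estimate of the expected case count.

```python
from datetime import date


def inclusive_days(start: date, end: date) -> int:
    """Number of calendar days from start through end, counting both endpoints."""
    return (end - start).days + 1


# Observation periods as stated in the abstract (month/day/year).
pre_days = inclusive_days(date(2019, 11, 25), date(2020, 2, 2))
covid_days = inclusive_days(date(2020, 3, 8), date(2020, 5, 16))

# Excess PE cases reported for the early-pandemic period.
EXCESS_PE_CASES = 92
daily_excess = EXCESS_PE_CASES / covid_days

print(pre_days, covid_days, round(daily_excess, 1))
```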
ASJC Scopus subject areas
- Medicine (miscellaneous)
- Health Informatics
- Computer Science Applications
- Health Information Management