Combining CT scans and clinical features for improved automated COVID-19 detection

R. HACKING

Master's thesis, 2021.

During the first peak of the COVID-19 pandemic, emergency units of hospitals in hard-hit regions were overflowing with patients with respiratory complaints. Since RT-PCR tests were in limited supply at the time and their results took a long time to obtain, many hospitals opted to use chest CT scans to assess patients suspected of having COVID-19.

As a result, several studies examined the possibility of automating the detection of COVID-19 in CT scans. One such study, by Lessmann et al., 2020, developed a model to predict COVID-19 severity scores from chest CT scans. In this thesis, we extended their model in several ways to take additional clinical features (such as blood values, sex, and age) into account when predicting either PCR outcomes or clinical diagnoses.

Based on data from the Canisius-Wilhelmina Ziekenhuis (CWZ) and Radboudumc hospitals, as well as the iCTCF COVID-19 dataset by Ning et al., 2020, we found that integrating clinical and visual features can indeed improve performance when both are of sufficient quality. When training on data from the CWZ hospital and evaluating on data from the Radboudumc hospital, models using only clinical features or only visual features achieved Area Under the ROC Curve (AUC) values of 0.773 and 0.826, respectively; their combination resulted in an AUC of 0.851.

Similarly, when training on data from the Union hospital in the iCTCF dataset and predicting on data from the Union hospital in that same dataset, we obtained AUCs of 0.687 and 0.812 for clinical and visual features, respectively; their combination resulted in an AUC of 0.862.
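For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an AUC can be computed from predicted scores. It uses the rank interpretation of the AUC (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half). The function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the thesis or its data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (1 = positive case).
    scores: iterable of model outputs, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: two positive and two negative cases.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for illustration; a rank-based implementation would usually be preferred for large evaluation sets.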

However, we also discovered that the patterns of missing data present in these clinical feature datasets can play an essential role in the performance of the models fitted on them. We therefore developed additional methods to analyze and mitigate this effect, in order to obtain fairer evaluations and increase model generalizability. Still, the high diagnostic performance of some of our models suggests that they could be adapted for clinical practice, and our methods for handling missing data could aid further research using clinical feature datasets.
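To illustrate why missing-data patterns matter, the sketch below compares how often a clinical feature is missing among positive and negative cases. If the rates differ strongly, a model can learn the label from the mere absence of a value rather than from the value itself. This is only a simple diagnostic under assumed conventions (missing values stored as `None`); the function name `missing_rate_by_class`, the feature names, and the records are hypothetical and do not represent the methods or data of the thesis.

```python
def missing_rate_by_class(records, labels, feature):
    """Return {label: fraction of records with that label lacking `feature`}.

    records: list of dicts mapping feature names to values (None = missing).
    labels:  list of class labels aligned with `records`.
    """
    counts = {}  # label -> (total records, records with the feature missing)
    for rec, y in zip(records, labels):
        total, missing = counts.get(y, (0, 0))
        is_missing = rec.get(feature) is None
        counts[y] = (total + 1, missing + (1 if is_missing else 0))
    return {y: missing / total for y, (total, missing) in counts.items()}


# Made-up records: the blood value "crp" is only measured for positive cases.
toy_records = [
    {"age": 61, "crp": 120.0},
    {"age": 45, "crp": 80.0},
    {"age": 70, "crp": None},
    {"age": 38, "crp": None},
]
toy_labels = [1, 1, 0, 0]

crp_rates = missing_rate_by_class(toy_records, toy_labels, "crp")
age_rates = missing_rate_by_class(toy_records, toy_labels, "age")
```

In this toy example the missingness of `crp` separates the classes perfectly while `age` is always present, so a classifier given a "value is missing" indicator for `crp` would appear highly accurate without using any clinical information.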