Deep Learning with Bayesian Inference for Prostate Cancer Diagnosis across Longitudinal Biparametric MRI

A. Saha, J.S. Bosma, C. Roest, M. Hosseinzadeh, J. Futterer and H. Huisman

Annual Meeting of the Radiological Society of North America 2021.

BACKGROUND: Despite the increasing use of active surveillance via biparametric MR imaging (bpMRI) for prostate cancer (PCa) management, little research in medical image computing utilizes longitudinal studies to assist present-day diagnosis. PURPOSE: To investigate the efficacy of a deep learning-based PCa detection model that integrates past bpMRI exams and population-level anatomical priors via Bayesian inference. MATERIALS AND METHODS: This retrospective study included 250 consecutive biopsy-naive men (median age: 64 yrs; IQR: 60-69) with elevated levels of PSA (median level: 8 ng/mL; IQR: 5-11), who underwent at least two consecutive MRI exams between 2016 and 2018 (N=500). Intra-patient bpMRI scans were rigidly registered, paired with expert voxel-level annotations of PI-RADS 2-5 lesions and subsequently used to train a deep learning model to predict and localize all PI-RADS findings. Radiologists utilize prior studies to inform present-day diagnosis. Similarly, for each patient case, computer-aided diagnosis for the follow-up bpMRI exam was derived via Bayesian modelling, probabilistically integrating past bpMRI exams and a population prior for spatial PCa prevalence and zonal anatomy. Diagnostic performance was evaluated by the ability to accurately discriminate patients with benign prostatic tissue (n=10) or PI-RADS <= 3 lesions (n=159) from those carrying PI-RADS >= 4 lesions (n=81), over 5-fold cross-validation. The normalized Wilcoxon-Mann-Whitney U statistic was used to derive AUROC, and confidence intervals were computed over 5000 replications of bootstrapping. RESULTS: Computer-aided diagnosis of follow-up studies without priors yielded an AUROC of 0.77 (95% CI: 0.71, 0.83), F0.5 score of 0.51 (95% CI: 0.42, 0.61), positive predictive value (PPV) of 0.51 (95% CI: 0.40, 0.60) and negative predictive value (NPV) of 0.78 (95% CI: 0.71, 0.83).
Computer-aided diagnosis of follow-up studies with the inclusion of priors via Bayesian inference yielded an AUROC of 0.80 (95% CI: 0.76, 0.84), F0.5 score of 0.58 (95% CI: 0.48, 0.68), PPV of 0.60 (95% CI: 0.48, 0.72) and NPV of 0.78 (95% CI: 0.72, 0.85). In comparison to stand-alone diagnosis, factoring in prior studies resulted in a 41.4% reduction in the standard deviation of AUROC across folds. CONCLUSION: Incorporating past studies and clinical priors via Bayesian inference can improve diagnostic certainty and robustness of deep learning in follow-up patient exams. CLINICAL RELEVANCE/APPLICATION: Prostate cancer is one of the most prevalent cancers in men worldwide. In the absence of experienced radiologists, its morphological heterogeneity can lead to low inter-reader agreement. Automated, reliable detection algorithms can improve diagnostic accuracy with consistent quantitative analysis.
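
The evaluation described under MATERIALS AND METHODS (AUROC derived from the normalized Wilcoxon-Mann-Whitney U statistic, with bootstrapped confidence intervals) can be illustrated with a minimal sketch. This is not the authors' code. The function names `auroc_mann_whitney` and `bootstrap_ci` are hypothetical, the use of one likelihood score per patient is an assumption, and the percentile method is assumed because the abstract does not state which bootstrap variant was used.

```python
import random


def auroc_mann_whitney(scores, labels):
    """AUROC as the Mann-Whitney U statistic divided by n_pos * n_neg."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # U counts positive/negative pairs ranked correctly; ties count one half.
    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_replications=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (assumed variant), resampling cases."""
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_replications:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        # Skip degenerate resamples that contain only one class.
        if len(set(sample_labels)) < 2:
            continue
        sample_scores = [scores[i] for i in idx]
        stats.append(auroc_mann_whitney(sample_scores, sample_labels))
    stats.sort()
    lower = stats[int((alpha / 2) * n_replications)]
    upper = stats[int((1 - alpha / 2) * n_replications) - 1]
    return lower, upper


# Toy data for illustration only (not from the study):
# label 1 = PI-RADS >= 4 case, label 0 = benign or PI-RADS <= 3 case.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auroc = auroc_mann_whitney(toy_scores, toy_labels)
toy_ci = bootstrap_ci(toy_scores, toy_labels, n_replications=200)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of 250 patients. A rank-based formulation would be preferable for much larger cohorts.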