A Robust Ensemble Algorithm for Ischemic Stroke Lesion Segmentation: Generalizability and Clinical Utility Beyond the ISLES Challenge

E. de la Rosa, M. Reyes, S. Liew, A. Hutton, R. Wiest, J. Kaesmacher, U. Hanning, A. Hakim, R. Zubal, W. Valenzuela, D. Robben, D. Sima, V. Anania, A. Brys, J. Meakin, A. Mickan, G. Broocks, C. Heitkamp, S. Gao, K. Liang, Z. Zhang, M. Siddiquee, A. Myronenko, P. Ashtari, S. Van Huffel, H. Jeong, C. Yoon, C. Kim, J. Huo, S. Ourselin, R. Sparks, A. Clèrigues, A. Oliver, X. Lladó, L. Chalcroft, I. Pappas, J. Bertels, E. Heylen, J. Moreau, N. Hatami, C. Frindel, A. Qayyum, M. Mazher, D. Puig, S. Lin, C. Juan, T. Hu, L. Boone, M. Goubran, Y. Liu, S. Wegener, F. Kofler, I. Ezhov, S. Shit, M. Petzsche, B. Menze, J. Kirschke and B. Wiestler

arXiv:2403.19425 2024.

Diffusion-weighted MRI (DWI) is essential for stroke diagnosis, treatment decisions, and prognosis. However, image and disease variability hinder the development of generalizable AI algorithms with clinical value. We address this gap by presenting a novel ensemble algorithm derived from the 2022 Ischemic Stroke Lesion Segmentation (ISLES) challenge. ISLES'22 provided 400 scans of patients with ischemic stroke from various medical centers, facilitating the development of a wide range of cutting-edge segmentation algorithms by the research community. Through collaboration with leading teams, we combined top-performing algorithms into an ensemble model that overcomes the limitations of individual solutions. On our internal test set, the ensemble model achieved higher ischemic lesion detection and segmentation accuracy than the individual algorithms. This accuracy generalized well across diverse image and disease variables. Furthermore, the model excelled in extracting clinical biomarkers. Notably, in a Turing-like test, neuroradiologists consistently preferred the algorithm's segmentations over manual expert segmentations, highlighting their greater comprehensiveness and precision. Validation using a real-world external dataset (N=1686) confirmed the model's generalizability. The algorithm's outputs also demonstrated strong correlations with clinical scores (admission NIHSS and 90-day mRS), on par with or exceeding expert-derived results, underlining its clinical relevance. This study offers two key findings. First, we present an ensemble algorithm (https://github.com/Tabrisrei/ISLES22_Ensemble) that detects and segments ischemic stroke lesions on DWI across diverse scenarios on par with expert (neuro)radiologists. Second, we show the potential for biomedical challenge outputs to extend beyond the challenge's initial objectives, demonstrating their real-world clinical applicability.
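
To make the idea of combining several segmentation algorithms concrete, the following is a minimal sketch, not the released implementation. The abstract does not state how the individual predictions are fused, so per-voxel majority voting is an assumption here, and the function names (`majority_vote`, `dice`, `lesion_volume_ml`), the toy masks, and the voxel size are all hypothetical. It also shows the Dice overlap, a common segmentation accuracy measure, and lesion volume as one example of a mask-derived biomarker.

```python
import numpy as np


def majority_vote(masks):
    """Fuse binary masks: a voxel is lesion if more than half of the models mark it.

    Assumption: strict majority voting; the source does not specify the fusion rule.
    """
    stacked = np.stack([np.asarray(m, dtype=bool) for m in masks])
    votes = stacked.sum(axis=0)
    return votes * 2 > len(masks)


def dice(pred, ref):
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, ref).sum() / denom)


def lesion_volume_ml(mask, voxel_size_mm):
    """Lesion volume in millilitres from a binary mask and voxel size in mm."""
    n_voxels = np.asarray(mask, dtype=bool).sum()
    return float(n_voxels * np.prod(voxel_size_mm) / 1000.0)


# Toy 1-D "scans": a reference mask and three hypothetical model outputs.
reference = [1, 1, 1, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 0]  # misses one lesion voxel
model_b = [1, 1, 1, 1, 0, 0]  # one false positive
model_c = [0, 1, 1, 0, 0, 1]  # one miss, one false positive

fused = majority_vote([model_a, model_b, model_c])

for name, mask in [("a", model_a), ("b", model_b), ("c", model_c), ("ensemble", fused)]:
    print(name, round(dice(mask, reference), 3))

# Hypothetical 2 x 2 x 2 mm voxels.
print("volume (mL):", lesion_volume_ml(fused, (2.0, 2.0, 2.0)))
```

In this toy case each model makes a different error, so the vote cancels them and the fused mask matches the reference exactly. This illustrates why an ensemble can outperform its members when their errors are not shared; it says nothing about the magnitude of the gains reported in the study.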