Trustworthy AI for automated screening of retinal diseases

C. González-Gonzalo

  • Promotor: Sánchez, C. I. and B. van Ginneken
  • Copromotor: Sánchez, C. I. and B. van Ginneken
  • Graduation year: 2023
  • Radboud University, Nijmegen


This thesis contributes to the current research (academic and industrial), regulatory, and ethical landscapes by advancing our understanding of trustworthiness in AI systems in healthcare, particularly in ophthalmology. Its overall objective is to offer insights, explore solutions, and make recommendations for the development of trustworthy deep learning (DL) systems, thereby helping to narrow the existing gap between the development of AI and its integration in healthcare and ophthalmology.

Chapter 2. In this chapter, we study the reliability of a CE-certified, DL-based device for the joint automated screening of diabetic retinopathy (DR) and age-related macular degeneration (AMD) in color fundus photography (CFP). By performing an external, multi-center validation, we investigate the ability of the commercially available system to generalize across populations and imaging acquisition protocols. We also compare its performance to that of a group of international retinal experts, and explore the consistency of human observers in DR and AMD grading. Our results support the view that AI can facilitate access to joint screening of retinal diseases and that currently available AI solutions can provide reliable and objective support to eye care providers.

Chapter 3. In this chapter, we focus on the explainability of DL systems' decisions and its impact on trust and clinical usability. We propose a deep visualization method, called visual evidence augmentation, to enhance DL models' explainability in classification tasks in medical imaging. The novel method combines visual attribution and selective inpainting to iteratively unveil the abnormalities responsible for anomalous predictions, without the need for manual, lesion-level annotations. We apply the method to automated screening of DR and AMD in CFP, and demonstrate its ability to improve weakly-supervised localization of different types of abnormalities. With this work, we contribute to opening the "black box" of AI, thereby increasing experts' trust and facilitating the integration of AI in clinical settings.
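
To make the iterative idea concrete, the following is a minimal toy sketch and not the implementation from the thesis. It assumes a hypothetical linear "classifier" (`toy_score`), a gradient-times-input attribution (`toy_attribution`), and a crude inpainting step that fills the most salient pixels with the image median; all function names, parameters, and thresholds are illustrative assumptions. The loop attributes, inpaints the strongest evidence, and repeats until the prediction is no longer anomalous, accumulating the attribution maps along the way.

```python
import numpy as np

def toy_score(image, weights):
    """Hypothetical stand-in for a DL classifier: logistic score of a linear model."""
    return 1.0 / (1.0 + np.exp(-(np.sum(weights * image) - 1.0)))

def toy_attribution(image, weights):
    """Gradient-times-input attribution for the linear toy model, positive part only."""
    return np.clip(weights * image, 0.0, None)

def augment_visual_evidence(image, weights, threshold=0.5, top_fraction=0.01, max_iters=20):
    """Iteratively attribute, inpaint the most salient pixels, and accumulate evidence."""
    current = image.astype(float).copy()
    evidence = np.zeros_like(current)
    fill_value = np.median(image)  # crude placeholder for a real inpainting method
    k = max(1, int(top_fraction * current.size))
    iters = 0
    while toy_score(current, weights) >= threshold and iters < max_iters:
        attribution = toy_attribution(current, weights)
        if attribution.max() <= 0:
            break  # nothing left to explain the prediction
        evidence += attribution
        # Select the k most salient pixels and keep only those with positive attribution.
        top_idx = np.argsort(attribution, axis=None)[-k:]
        mask = np.zeros(current.size, dtype=bool)
        mask[top_idx] = True
        mask = mask.reshape(current.shape) & (attribution > 0)
        current[mask] = fill_value  # "selective inpainting" of the salient region
        iters += 1
    return evidence, current, iters

# Toy image with two bright "abnormalities" of different strength.
image = np.zeros((8, 8))
image[1, 1] = 5.0
image[6, 6] = 3.0
weights = np.ones_like(image)

evidence, inpainted, iters = augment_visual_evidence(image, weights)
print("iterations:", iters)
print("evidence found at:", np.argwhere(evidence > 0).tolist())
```

In this toy setting the stronger abnormality is removed first and the weaker one surfaces in the next pass; a real system would replace the linear model with a trained network and the median fill with a proper inpainting method.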

Chapter 4. In this chapter, we focus on the robustness of DL systems against malicious attacks and the importance of characterizing the actual threat these attacks pose. We study previously unexplored factors affecting the vulnerability of DL systems to adversarial attacks in three different medical applications and imaging modalities: screening for referable DR in CFP, classification of pathologies in chest X-ray, and detection of breast cancer metastasis in histopathology slides of lymph node sections. We demonstrate that ImageNet pre-training, commonly used in medical imaging, may substantially increase adversarial attack vulnerability, and that disparity between the training data of the target model and that of the attacker's model decreases attack performance. This work also provides recommendations to increase the safety of DL systems meant to be clinically deployed and to perform realistic evaluations of adversarial robustness.

Chapter 5. In this chapter, we explore the main aspects and challenges to consider along the AI design pipeline in ophthalmology in order to build systems that meet the requirements for trustworthiness, including those concerning accuracy, resiliency, reliability, safety, and accountability. We elaborate on mechanisms to address those aspects and challenges at specific points of patient care, and define the roles, responsibilities, and interactions of the different stakeholders involved in AI for ophthalmic care. This study helps establish the basis for a much-needed collaborative approach and identifies key action points to ensure the potential benefits of AI reach real-world ophthalmic settings. The main findings from this work can be translated to other medical specialties.