Image-based automated Psoriasis Area Severity Index scoring by Convolutional Neural Networks

M. Schaap, N. Cardozo, A. Patel, E. de Jong, B. van Ginneken and M. Seyger

Journal of the European Academy of Dermatology and Venereology 2022;36(1):68-75.

Abstract
Background The Psoriasis Area and Severity Index (PASI) score is commonly used in clinical practice and research to monitor disease severity and determine treatment efficacy. Automating the PASI score with deep learning algorithms, such as Convolutional Neural Networks (CNNs), could enable objective and efficient PASI scoring.
Objectives To assess the performance of image-based automated PASI scoring in anatomical regions by CNNs and to compare the performance of CNNs with image-based scoring by physicians.
Methods Imaging series were matched to PASI subscores determined in real life by the treating physician. CNNs were trained using standardized imaging series of 576 trunk, 614 arm and 541 leg regions. A separate CNN was trained for each PASI subscore (erythema, desquamation, induration and area) in each anatomical region (trunk, arms and legs). The head region was excluded for anonymity. Additionally, PASI-trained physicians retrospectively determined image-based subscores on the test set images of the trunk. Agreement with the real-life scores was determined with the intraclass correlation coefficient (ICC) and compared between the CNNs and physicians.
Results ICCs between the CNN and real-life scores of the trunk region were 0.616, 0.580, 0.580 and 0.793 for erythema, desquamation, induration and area, respectively, with similar results for the arm and leg regions. PASI-trained physicians (N = 5) were in moderate-to-good agreement with each other (ICCs 0.706-0.793) for image-based PASI scoring of the trunk region. Agreement with the real-life scores was slightly higher for the CNN than for image-based scoring by physicians on erythema (0.616 vs. 0.558), induration (0.580 vs. 0.573) and area (0.793 vs. 0.694). On desquamation scoring, physicians slightly outperformed the CNN (0.589 vs. 0.580).
Conclusions Convolutional Neural Networks have the potential to automatically and objectively perform image-based PASI scoring at an anatomical region level. CNNs performed similarly to physicians on image-based erythema, desquamation and induration scoring, and outperformed physicians on area scoring.
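
The agreement analysis described in the Methods can be illustrated with a minimal sketch of an ICC computation. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here. The function name `icc_2_1` and the example scores are hypothetical and are not taken from the study.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, n_raters) array. For example, column 0 could
    hold the real-life subscore and column 1 the image-based subscore.
    Assumption: this ICC form is chosen for illustration only.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)  # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)  # between raters
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


if __name__ == "__main__":
    # Hypothetical erythema subscores (0-4 scale) for eight trunk images:
    # column 0 = real-life score, column 1 = image-based score.
    scores = [[2, 2], [3, 2], [1, 1], [4, 3], [0, 1], [2, 3], [3, 3], [1, 0]]
    print(f"ICC(2,1) = {icc_2_1(scores):.3f}")
```

Because ICC(2,1) measures absolute agreement, a rater who scores consistently one point higher than the reference is penalised, which is the relevant behaviour when image-based scores are meant to replace real-life scores rather than merely rank patients in the same order.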