Generating dermatopathology reports from gigapixel whole slide images with HistoGPT

M. Tran, P. Schmidle, R. Guo, S. Wagner, V. Koch, V. Lupperger, B. Novotny, D. Murphree, H. Hardway, M. D'Amato, J. Lefkes, D. Geijs, A. Feuchtinger, A. Böhner, R. Kaczmarczyk, T. Biedermann, A. Amir, A. Mooyaart, F. Ciompi, G. Litjens, C. Wang, N. Comfere, K. Eyerich, S. Braun, C. Marr and T. Peng

Nature Communications 2025;16.

Abstract

Histopathology is the reference standard for diagnosing the presence and nature of many diseases, including cancer. However, analyzing tissue samples under a microscope and summarizing the findings in a comprehensive pathology report is time-consuming, labor-intensive, and non-standardized. To address this problem, we present HistoGPT, a vision language model that generates pathology reports from a patient's multiple full-resolution histology images. It is trained on 15,129 whole slide images from 6,705 dermatology patients with corresponding pathology reports. The generated reports match the quality of human-written reports for common and homogeneous malignancies, as confirmed by natural language processing metrics and domain expert analysis. We evaluate HistoGPT in an international, multi-center clinical study and show that it can accurately predict tumor subtypes, tumor thickness, and tumor margins in a zero-shot fashion. Our model demonstrates the potential of artificial intelligence to assist pathologists in evaluating, reporting, and understanding routine dermatopathology cases.