Deep learning universal lesion segmentation for automated RECIST measurements on CT: comparison to manual assessment by radiologists

M. Grauw, B. Ginneken, B. Geisler, E. Smit, M. Rooij, S. Schalekamp and M. Prokop

European Congress of Radiology 2022.

PURPOSE: Automating aspects of RECIST evaluation can save time and potentially reduce inter-observer variability. We trained a 3D universal lesion segmentation (ULS) model to estimate long- and short-axis diameters in CT exams from a single click inside the lesion. METHODS: We used the nnU-Net framework to train the ULS model on 3213 lesions from 1481 studies collected from eight public challenge datasets. We fine-tuned the model using masks predicted for lesions from a subset of the public DeepLesion dataset. A reader study was conducted with 128 separate DeepLesion scans. Four radiologists manually measured the long and short axes of lesions on axial CT slices and assessed whether each lesion was eligible as a target lesion. RESULTS: For 85 out of 128 scans, all readers agreed that the scan contained a valid RECIST target lesion. For those lesions, the relative differences between the DeepLesion measurements and the radiologists' measurements were -4.2% ± 14.2% and -0.3% ± 13.2% for the long and short axes, respectively. For ULS, these differences were 6% ± 17% and -5.8% ± 18.9%. The mean absolute differences were 2.5 ± 3 mm and 1.9 ± 2 mm for radiologists. For ULS, these differences were 4.1 ± 5.8 mm and 2.8 ± 2.7 mm. For 78.8% of lesions, the absolute difference between DeepLesion and ULS measurements fell within one standard deviation of the inter-radiologist variability. CONCLUSIONS: Single-click measurement using ULS shows promise to simplify and speed up RECIST evaluation in approximately 80% of oncological CT exams. LIMITATIONS: This study used a small number of lesions in the test set, and readers measured the long and short axes of all lesions, which is not required by RECIST. FUNDING: This research was supported by the Eurostars PIANO project E113829.
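The agreement figures in RESULTS can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes that each measurement is compared against a per-lesion reference diameter in millimetres (for example, the mean of the reader measurements) and that the reported spread is a sample standard deviation. The function names `agreement_stats` and `fraction_within` and the example values are hypothetical.

```python
import numpy as np


def agreement_stats(measured_mm, reference_mm):
    """Summarise agreement between two sets of per-lesion diameters (mm).

    Assumption: `reference_mm` holds one reference diameter per lesion,
    e.g. the mean of the reader measurements. The abstract does not state
    how the reference was formed.
    """
    measured = np.asarray(measured_mm, dtype=float)
    reference = np.asarray(reference_mm, dtype=float)

    # Signed relative difference in percent, per lesion.
    rel_pct = 100.0 * (measured - reference) / reference
    # Absolute difference in millimetres, per lesion.
    abs_mm = np.abs(measured - reference)

    return {
        "rel_mean_pct": rel_pct.mean(),
        "rel_sd_pct": rel_pct.std(ddof=1),  # sample SD (assumed)
        "abs_mean_mm": abs_mm.mean(),
        "abs_sd_mm": abs_mm.std(ddof=1),    # sample SD (assumed)
    }


def fraction_within(measured_mm, reference_mm, tolerance_mm):
    """Fraction of lesions whose absolute difference is <= tolerance_mm.

    Assumption: `tolerance_mm` stands in for one standard deviation of the
    inter-radiologist variability, supplied by the caller.
    """
    measured = np.asarray(measured_mm, dtype=float)
    reference = np.asarray(reference_mm, dtype=float)
    return float(np.mean(np.abs(measured - reference) <= tolerance_mm))


if __name__ == "__main__":
    # Made-up long-axis diameters in mm, for illustration only.
    model = [11.0, 18.0, 33.0]
    readers = [10.0, 20.0, 30.0]
    print(agreement_stats(model, readers))
    print(fraction_within(model, readers, tolerance_mm=2.0))
```

Keeping the signed relative difference separate from the absolute difference mirrors the abstract: the former exposes systematic over- or under-measurement, while the latter reflects the typical size of the disagreement in millimetres.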