Benchmarking of Artificial Intelligence and Radiologists for Indeterminate Lung Nodule Malignancy Risk Estimation on Screening CT: Results of the LUNA25 Challenge

D. Peeters, B. Obreja, N. Antonissen, Z. Saghir, U. Pastorino, G. De Bock, R. Vliegenthart, M. Prokop and C. Jacobs

European Congress of Radiology 2026.

Purpose:

Accurate risk classification of indeterminate (5–15 mm) lung nodules can reduce unnecessary follow-up in lung cancer screening. AI may assist in risk classification, but benchmarking studies are limited. Here, we present the results of the LUNA25 challenge, a public competition that evaluates AI and radiologist performance for malignancy risk estimation of indeterminate nodules on screening CT.

Methods:

LUNA25 consists of an AI study and a reader study. For AI development, participants had access to a public dataset of 4069 CT scans from the National Lung Screening Trial (NLST), with 555 malignant and 5608 benign nodules. AI evaluation was performed on an external test set with 156 malignant and 312 benign indeterminate solid and part-solid nodules from baseline scans of the Danish (DLCST), Dutch-Belgian (NELSON), and Italian (MILD) lung cancer screening trials. For the reader study, radiologists assessed 300 nodules from the test set, assigning each a malignancy risk score (0–100) and a management recommendation (low, intermediate, or high risk). Performance was compared using area under the ROC curve (AUC), sensitivity, and specificity.
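
As an illustration of the three metrics named above, the following minimal Python sketch computes AUC, sensitivity, and specificity from per-nodule risk scores. It is not the challenge's evaluation code: the function names `auc` and `sensitivity_specificity`, the cut-off of 50, and the six toy scores are assumptions made for this example only. AUC is computed in its rank form, that is, the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one, with ties counted as one half.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: P(malignant score > benign score), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # malignant nodules
    neg = [s for s, y in zip(scores, labels) if y == 0]  # benign nodules
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign nodule")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Treat score >= threshold as a positive call (hypothetical convention)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign nodule")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (not from the study): risk scores on a 0-100 scale, 1 = malignant.
toy_scores = [90, 70, 40, 60, 20, 10]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, threshold=50)
```

The pairwise loop is quadratic in the number of nodules, which is acceptable for a test set of a few hundred cases; a sort-based rank computation would be the usual choice for larger sets.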

Results:

On the subset of 300 nodules, the top-performing AI system achieved a significantly higher AUC (0.78; 95% CI: 0.73–0.84) than the average of 75 readers (0.69; 95% CI: 0.64–0.74; p<0.001). At the ≥ indeterminate risk threshold, the AI correctly classified 12% more malignant cases at matched specificity and produced 20% fewer false positives at matched sensitivity.
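
One way to read "at matched specificity" is sketched below: the AI threshold is chosen so that its specificity is at least that of the reader operating point, and sensitivity is then compared at that threshold. This is an assumed procedure for illustration, not necessarily the one used in the challenge; the function name `sensitivity_at_matched_specificity`, the target values, and the toy scores are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_at_matched_specificity(
    scores: Sequence[float], labels: Sequence[int], target_specificity: float
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) for the lowest threshold
    whose specificity reaches the target. Score >= threshold is a positive call."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # malignant nodules
    neg = [s for s, y in zip(scores, labels) if y == 0]  # benign nodules
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign nodule")
    # Candidate thresholds in ascending order; the extra value above the
    # maximum score calls nothing positive, so specificity 1.0 is reachable.
    candidates = sorted(set(scores)) + [max(scores) + 1]
    for t in candidates:
        spec = sum(1 for s in neg if s < t) / len(neg)
        if spec >= target_specificity:
            # Specificity only rises and sensitivity only falls as t grows,
            # so the first qualifying threshold has the highest sensitivity.
            sens = sum(1 for s in pos if s >= t) / len(pos)
            return t, sens, spec
    raise ValueError("target_specificity must be at most 1.0")


# Toy data (not from the study): risk scores on a 0-100 scale, 1 = malignant.
toy_scores = [90, 70, 40, 60, 20, 10]
toy_labels = [1, 1, 1, 0, 0, 0]
matched = sensitivity_at_matched_specificity(toy_scores, toy_labels, 0.6)
strict = sensitivity_at_matched_specificity(toy_scores, toy_labels, 1.0)
```

The mirror-image comparison, false positives at matched sensitivity, follows the same pattern with the roles of the malignant and benign groups exchanged and the thresholds scanned in descending order.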

Conclusion:

The top-performing AI system performed significantly better than the average radiologist in estimating malignancy risk for indeterminate lung nodules detected on screening CT, highlighting its potential as a decision-support tool.