Lung cancer is the leading cause of cancer death among both men and women, accounting for nearly 25% of all cancer deaths. While lung cancer typically shows up as pulmonary nodules on CT images, most nodules are benign and do not require further clinical workup. Accurately distinguishing between benign and malignant nodules is therefore crucial to catch lung cancers early. In a paper published in Radiology, we developed and externally validated a deep learning algorithm for estimating the malignancy risk of lung nodules in low-dose CT scans.
We developed the algorithm with 16,077 nodules (1,249 cancers) from the National Lung Screening Trial, then externally validated it with 883 nodules (65 cancers) from the Danish Lung Cancer Screening Trial. In the external validation dataset, our algorithm achieved an AUC of 0.93 and significantly outperformed the clinically established PanCan model. On cancer-enriched subsets, we also compared the algorithm with a group of 11 clinicians: 4 thoracic radiologists, 5 radiology residents, and 2 pulmonologists. The algorithm performed comparably with thoracic radiologists.
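For readers less familiar with the metric: an AUC can be read as the probability that a randomly chosen malignant nodule receives a higher risk score than a randomly chosen benign one. The following is a minimal sketch of that definition in Python. The helper name `auc` and the toy scores and labels are made up for illustration; they are not data or code from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    scores: predicted malignancy risks; labels: 1 = malignant, 0 = benign.
    A tie between a malignant and a benign score counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign nodule")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: illustrative values only, not study data.
toy_scores = [0.92, 0.15, 0.70, 0.05, 0.40, 0.80]
toy_labels = [1, 0, 1, 0, 0, 0]
print(auc(toy_scores, toy_labels))  # prints 0.875 (7 of 8 pairs ranked correctly)
```

The pairwise loop is quadratic in the number of nodules, which is fine for a sketch; a rank-based implementation would be preferable for large datasets.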
We have made the algorithm freely accessible to the public for research purposes at grand-challenge.org.
Some caveats: the algorithm is best suited to nodules seen at baseline screening. For nodules seen at subsequent screenings, growth and changes in appearance relative to the previous CT images are also important. We did not calibrate the risk scores in this study, so the algorithm tends to be over-confident in its predictions, a known problem with deep learning algorithms.
[UPDATE] Since September 2021, the algorithm includes a calibrator; the method behind the calibration is explained here.
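To illustrate what calibrating risk scores can involve, here is a minimal sketch of temperature scaling, one common post-hoc technique for over-confident models. It is not necessarily the method used for this algorithm; the function names and the toy scores below are assumptions made purely for the example.

```python
import math


def logit(p, eps=1e-6):
    """Inverse sigmoid, clipped to avoid infinities at 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nll(probs, labels, eps=1e-12):
    """Mean negative log-likelihood (log loss) of binary predictions."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(labels)


def rescale(probs, temperature):
    """Divide each logit by the temperature; T > 1 pulls scores toward 0.5."""
    return [sigmoid(logit(p) / temperature) for p in probs]


def fit_temperature(probs, labels):
    """Grid-search the temperature that minimizes log loss on held-out data."""
    candidates = [0.5 + 0.05 * i for i in range(91)]  # 0.5 to 5.0
    return min(candidates, key=lambda t: nll(rescale(probs, t), labels))


# Toy held-out set (hypothetical): confident scores, two of them wrong.
raw_scores = [0.99, 0.98, 0.97, 0.02, 0.01, 0.03, 0.96, 0.04]
labels = [1, 1, 0, 0, 0, 1, 1, 0]

T = fit_temperature(raw_scores, labels)
calibrated = rescale(raw_scores, T)
print("temperature:", T)
print("log loss before:", nll(raw_scores, labels))
print("log loss after: ", nll(calibrated, labels))
```

In practice the temperature would be fitted on a held-out set of nodules with known outcomes and then applied unchanged to new scores. Because dividing logits by a positive constant preserves the ranking of nodules, this kind of rescaling changes the confidence of the scores without changing the AUC.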