Development and Validation of a Convolutional Neural Network for Automated Detection of Scaphoid Fractures on Conventional Radiographs

N. Hendrix, E. Scholten, B. Vernhout, S. Bruijnen, B. Maresch, M. de Jong, S. Diepstraten, S. Bollen, S. Schalekamp, M. de Rooij, A. Scholtens, W. Hendrix, T. Samson, L. Sharon Ong, E. Postma, B. van Ginneken and M. Rutten

Radiology: Artificial Intelligence 2021:e200260.


Purpose: To compare the performance of a convolutional neural network (CNN) with that of 11 radiologists in detecting scaphoid fractures on conventional radiographs of the hand, wrist, and scaphoid.

Materials and Methods: At two hospitals (Hospitals A and B), three datasets consisting of conventional hand, wrist, and scaphoid radiographs were retrospectively retrieved: a dataset of 1039 radiographs (775 patients [mean age, 48 ± 23 years; 505 females], period: 2017–2019, Hospitals A and B) for developing a scaphoid segmentation CNN, a dataset of 3000 radiographs (1846 patients [mean age, 42 ± 22 years; 937 females], period: 2003–2019, Hospital B) for developing a scaphoid fracture detection CNN, and a dataset of 190 radiographs (190 patients [mean age, 43 ± 20 years; 77 females], period: 2011–2020, Hospital A) for testing the complete fracture detection system. Both CNNs were applied consecutively: the segmentation CNN localized the scaphoid and then passed the relevant region to the detection CNN for fracture detection. In an observer study, the performance of the system was compared with that of 11 radiologists. Evaluation metrics included the Dice similarity coefficient (DSC), Hausdorff distance (HD), sensitivity, specificity, positive predictive value (PPV), and area under the receiver operating characteristic curve (AUC).

Results: The segmentation CNN achieved a DSC of 97.4% ± 1.4 with an HD of 1.31 mm ± 1.03. The detection CNN had a sensitivity of 78% (95% CI: 70, 86), specificity of 84% (95% CI: 77, 92), PPV of 83% (95% CI: 77, 90), and AUC of 0.87 (95% CI: 0.81, 0.91). There was no statistically significant difference between the AUC of the CNN and that of the radiologists (0.87 [95% CI: 0.81, 0.91] versus 0.83 [radiologist range: 0.78–0.85]; P = .09).

Conclusion: The developed CNN achieved radiologist-level performance in detecting scaphoid fractures on conventional radiographs of the hand, wrist, and scaphoid.
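
As an illustration of how some of the evaluation metrics named in the abstract are commonly computed, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function names `dice_coefficient` and `detection_metrics`, the nested-list mask format, and the toy inputs are all assumptions made for this example. The Hausdorff distance, the AUC, and the confidence intervals are not covered here.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Masks are nested lists of 0/1 values with identical shape
    (an assumed toy format, not the study's data format).
    """
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def detection_metrics(y_true, y_pred):
    """Sensitivity, specificity, and PPV from binary labels (1 = fracture)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
    }


if __name__ == "__main__":
    # Toy segmentation masks: 2 overlapping pixels, 3 foreground pixels each.
    reference = [[0, 1, 1], [0, 1, 0]]
    predicted = [[0, 1, 0], [0, 1, 1]]
    print(f"DSC: {dice_coefficient(reference, predicted):.3f}")

    # Toy fracture labels: 4 fractured and 4 non-fractured cases.
    truth = [1, 1, 1, 1, 0, 0, 0, 0]
    calls = [1, 1, 1, 0, 0, 0, 0, 1]
    print(detection_metrics(truth, calls))
```

In this sketch the detection metrics take already-thresholded binary predictions; a real system that outputs a fracture probability would first need an operating threshold, which the abstract does not specify.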