The impact of numbers of readers and methods of arbitration on pulmonary nodule detection in the context of lung cancer screening with CT

A. Nair, S. Desai, C. J., A. Edey, S. Walsh, G. Robinson, J. Field, D. Baldwin, R. Vliegenthart, M. Oudkerk, B. van Ginneken, P. de Jong, M. Prokop, D. Hansell and A. Devaraj

Annual Meeting of the European Society of Thoracic Imaging 2012.

Aim: To determine whether a) increasing numbers of readers and b) methods of arbitration significantly influence nodule detection. Methods: 85 CTs performed as part of the NELSON lung cancer screening trial were read twice by five experienced thoracic radiologists. During the first reading, radiologists classified all opacities as positive (nodules >3 mm) or negative (non-nodular opacities and nodules <3 mm). In the second reading, each radiologist categorised the opacities identified by the other radiologists. Readers' final scores were combined to simulate double- and triple-reading. For double-reading, if there was disagreement, a third independent reader provided arbitration. For triple-reading, agreement between ≥2 radiologists constituted a positive reading. The reference standard was agreement by ≥4 radiologists. Results: 531 opacities were identified, of which 186 (35.0%) were nodules that met the reference standard criteria. Double-reading without arbitration had a variable impact on nodule detection: there was a significant increase in mean sensitivity (22.6%, p<0.005) and specificity (11.3%, p<0.05) in 5 and 16 pairs, respectively; a significant reduction in mean sensitivity (20.7%, p<0.005) in 2 pairs; and no change in mean sensitivity and specificity in 13 and 4 pairs, respectively. Triple-reading and double-reading with arbitration significantly increased mean sensitivity in 29/30 triple-reader combinations (24.5%, p<0.05) and 6/10 pairs (17.7%, p<0.05), respectively, but significantly decreased mean specificity in 26/30 triple-reader combinations (11.3%, p<0.05) and 10/10 pairs (11.7%, p<0.0005), respectively. Conclusions: Double-reading does not invariably improve nodule detection accuracy for experienced thoracic radiologists. However, improved sensitivity is achieved by triple-reading or double-reading with independent arbitration, at the expense of reduced specificity.
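
The combination rules described in the Methods can be illustrated with a short Python sketch. This is not the study's analysis code: the toy score matrix, the function names, and the choice of which reader acts as arbiter are all assumptions made for illustration. The abstract does not state how double-reading without arbitration was scored, so that case is left out.

```python
from itertools import combinations

# Hypothetical toy data (NOT from the study): one row per opacity,
# one column per reader; 1 = scored positive (nodule), 0 = scored negative.
READS = [
    [1, 1, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]


def reference_standard(row, threshold=4):
    """Reference-positive when at least `threshold` of the readers agree."""
    return sum(row) >= threshold


def triple_read(row, readers):
    """Triple-reading: positive when at least 2 of the 3 readers are positive."""
    return sum(row[r] for r in readers) >= 2


def double_read_with_arbitration(row, pair, arbiter):
    """Double-reading: if the pair disagree, a third reader decides."""
    first, second = row[pair[0]], row[pair[1]]
    if first == second:
        return bool(first)
    return bool(row[arbiter])


def sensitivity_specificity(calls, truth):
    """Return (sensitivity, specificity) of `calls` against `truth`."""
    true_pos = sum(1 for c, t in zip(calls, truth) if c and t)
    true_neg = sum(1 for c, t in zip(calls, truth) if not c and not t)
    n_pos = sum(1 for t in truth if t)
    n_neg = len(truth) - n_pos
    return true_pos / n_pos, true_neg / n_neg


TRUTH = [reference_standard(row) for row in READS]

# Every triple-reader combination that can be formed from five readers.
TRIPLES = list(combinations(range(5), 3))

for readers in TRIPLES:
    calls = [triple_read(row, readers) for row in READS]
    sens, spec = sensitivity_specificity(calls, TRUTH)
    print(readers, f"sensitivity={sens:.2f}", f"specificity={spec:.2f}")
```

Comparing each combined result with the sensitivity and specificity of a single reader's column gives the kind of per-pair or per-combination comparison reported in the Results.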