Fully Automatic Measurement of the Splenic Volume in CT with U-Net Convolutional Neural Networks

J. Bukala, G.E. Humpire Mamani, E. Scholten, M. Prokop, B. van Ginneken and C. Jacobs

in: Annual Meeting of the Radiological Society of North America, 2017


Purpose: To develop a fully automatic deep learning method for 3D segmentation of the spleen on computed tomography (CT) scans and to compare the automatically measured splenic volume with the standard splenic index approximation formula, which requires three manual 2D measurements.
Method and Materials: We collected 145 thorax-abdomen CT scans from our institute. All scans were contrast enhanced and acquired with a slice thickness of 1 or 2 mm. Trained human observers manually segmented the spleen in 3D in all scans. We used 100 scans for training and 45 scans as an independent test set. In the test set, a human observer applied the standard approximation formula to obtain an estimate of the splenic volume. The system analyzes the entire thorax-abdomen CT scan to locate and segment the spleen, without any need for pre-processing. Multiple U-Net convolutional neural networks, one per orthogonal direction, were trained on the training data set. A validation set consisting of 30% of the training data was used to optimize the hyperparameters of the networks. A dedicated hard mining selection strategy was employed to improve the learning process. The predictions of the U-Nets were averaged and subsequently thresholded to obtain a 3D spleen segmentation. The mean absolute error of the splenic volume, measured against the manual reference standard, was used to assess the accuracy of both the deep learning approach and the standard approximation formula. The performance of the deep learning approach was also evaluated by computing the Dice similarity coefficient on the test set.
Results: The deep learning approach resulted in a mean absolute error of 8.5% (SD 11.6) in the splenic volume, while the approximation formula gave a significantly higher (p<0.01) mean absolute error of 17.7% (SD 14.7). The average Dice score between the deep learning segmentations and the reference segmentations was 0.91 (SD 0.08).
Conclusion: Splenic volume can be assessed fully automatically using a U-Net deep learning approach, with an accuracy that is substantially better than that of the approximation formula widely used in clinical practice.
Clinical relevance/Application: An accurate splenic volume measurement can be used for assessing splenomegaly and for detecting changes in splenic volume over time.
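
The post-processing and evaluation steps described under Method and Materials (averaging the per-direction network outputs, thresholding, and comparing volume and overlap against the manual reference) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the 0.5 threshold, the conversion to millilitres, and the definition of the volume error as a percentage of the reference volume are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def ensemble_segmentation(prob_maps, threshold=0.5):
    """Average per-direction probability maps, then threshold to a binary mask.

    prob_maps: list of 3D arrays of equal shape, one per network (hypothetical input).
    threshold: assumed value; the abstract does not state the threshold used.
    """
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return mean_prob >= threshold


def volume_ml(mask, spacing_mm):
    """Volume of a binary mask in millilitres, given voxel spacing in mm."""
    voxel_mm3 = float(np.prod(spacing_mm))
    return float(np.count_nonzero(mask)) * voxel_mm3 / 1000.0


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = np.count_nonzero(a) + np.count_nonzero(b)
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * np.count_nonzero(a & b) / denom


def abs_volume_error_percent(pred_ml, ref_ml):
    """Absolute volume error as a percentage of the reference volume (assumed definition)."""
    return 100.0 * abs(pred_ml - ref_ml) / ref_ml


def mean_absolute_error_percent(pred_volumes, ref_volumes):
    """Mean and standard deviation of the per-scan absolute percentage errors."""
    errors = [abs_volume_error_percent(p, r) for p, r in zip(pred_volumes, ref_volumes)]
    return float(np.mean(errors)), float(np.std(errors))
```

Under these assumptions, a scan would be processed by passing the probability maps of the individual networks to `ensemble_segmentation`, converting the resulting mask and the manual reference mask to volumes with `volume_ml`, and then computing `dice` and `abs_volume_error_percent` per scan before aggregating over the test set.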