Classifying symmetrical differences and temporal change for the detection of malignant masses in mammography using deep neural networks

T. Kooi and N. Karssemeijer

Journal of Medical Imaging 2017;4(4). International Society for Optics and Photonics.


Neural networks, in particular deep Convolutional Neural Networks (CNNs), have recently gone through a renaissance sparked by the introduction of more efficient training procedures and massive amounts of annotated data. Barring a handful of modalities, medical images are typically too large to be presented to a network as a whole, and models are consequently trained on subsets of images or cases that represent the most crucial bits of information. When inspecting a scene to identify objects, humans take cues not just from the article in question but also from the elements in its vicinity: a frisbee is more likely to be a plate in the presence of a fork and knife. Similar principles apply to the analysis of medical images: specialists base their judgment of an abnormality on all available data, harnessing information such as symmetrical differences in or between the organs in question and temporal change, if multiple recordings are available.

In this paper, we investigate the addition of symmetry and temporal context information to a deep CNN with the purpose of detecting malignant soft tissue lesions in mammography. We employ a simple linear mapping that takes the location of a mass candidate and maps it to either the contralateral or the prior mammogram, and Regions Of Interest (ROIs) are extracted around each location. We subsequently explore two different architectures: (1) a fusion model employing two data streams, where both ROIs are fed to the network during training and testing, and (2) a stage-wise approach, where a single ROI CNN is trained on the primary image and subsequently used as a feature extractor for both the primary and the symmetrical or prior ROI. A 'shallow' Gradient Boosted Tree (GBT) classifier is then trained on the concatenation of these features and used to classify the joint representation. Results show a significant increase in performance using the first architecture and symmetry information, but only marginal gains using temporal data and the second setting. We believe the results are promising and could be improved substantially when more temporal data become available.
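To make the two setups concrete, the following is a minimal sketch, not the authors' published code, of architecture (1), the two-stream fusion model, and architecture (2), the stage-wise CNN-plus-GBT pipeline. It assumes PyTorch and scikit-learn; all layer sizes, feature dimensions, and hyperparameters are illustrative assumptions rather than the values used in the paper.

```python
# Illustrative sketch of the two architectures described in the abstract.
# Layer sizes, names, and settings are assumptions, not the paper's configuration.
import torch
import torch.nn as nn
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

class ROIEncoder(nn.Module):
    """Small CNN that maps a single-channel mammographic ROI to a feature vector."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, feat_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

class FusionModel(nn.Module):
    """Architecture (1): two data streams; the primary ROI and the contralateral
    (or prior) ROI are fed jointly to the network during training and testing."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.primary = ROIEncoder(feat_dim)
        self.context = ROIEncoder(feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, 2)  # benign vs. malignant

    def forward(self, roi_primary, roi_context):
        f = torch.cat([self.primary(roi_primary), self.context(roi_context)], dim=1)
        return self.classifier(f)

def extract_features(encoder, rois):
    """Architecture (2), stage-wise: reuse a trained ROI CNN as a fixed feature
    extractor; features of primary and symmetrical/prior ROIs are concatenated
    and classified by a 'shallow' gradient boosted tree classifier."""
    encoder.eval()
    with torch.no_grad():
        return encoder(rois).cpu().numpy()

# Hypothetical usage with candidate ROIs of shape (n_candidates, 1, H, W):
# enc = ROIEncoder()                      # trained beforehand on primary ROIs
# X = np.hstack([extract_features(enc, rois_primary),
#                extract_features(enc, rois_context)])
# gbt = GradientBoostingClassifier().fit(X, labels)
```

In the fusion setup, both streams would be optimized jointly end to end; in the stage-wise setup, the ROI encoder is kept fixed and only the GBT is fit on the concatenated CNN features.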