Dissimilarity-based classification in the absence of local ground truth: application to the diagnostic interpretation of chest radiographs

Y. Arzhaeva, D. Tax and B. van Ginneken

Pattern Recognition 2009;42(9):1768-1776.


In this paper, classification on dissimilarity representations is applied to medical imaging data, with the task of discriminating between normal images and images with signs of disease. We show that dissimilarity-based classification is a beneficial approach for dealing with weakly labeled data, i.e. when the location of disease in an image is unknown and local feature-based classifiers therefore cannot be trained. A modification to the standard dissimilarity-based approach is proposed that makes the dissimilarity measure multi-valued and hence able to retain more information. A multi-valued dissimilarity between an image and a prototype becomes the image representation vector used in classification. The classification outputs obtained with respect to different prototypes are then integrated into a final image decision. Both the standard and the proposed methods are evaluated on data sets of chest radiographs with textural abnormalities and compared to several feature-based region classification approaches applied to the same data. On a tuberculosis data set, the multi-valued dissimilarity-based classification performs as well as the best region classification method applied to the fully labeled data, with an area under the receiver operating characteristic (ROC) curve (Az) of 0.82. The standard dissimilarity-based classification yields Az=0.80. On a data set with interstitial abnormalities, both dissimilarity-based approaches achieve Az=0.98, which is close behind the best region classification method.
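The difference between a single-valued and a multi-valued dissimilarity can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: each image is represented as an array of region feature vectors, the per-region dissimilarity is the Euclidean distance to the nearest prototype region, the single-valued dissimilarity is the mean of those distances, and the function names (region_distances, single_valued, multi_valued, dissimilarity_representation, auc) are hypothetical. The area under the ROC curve is computed here with the standard rank-based (Mann-Whitney) estimate.

```python
import numpy as np

def region_distances(image, prototype):
    """Distance from each region of `image` to its nearest region in `prototype`.

    image: array of shape (n_regions, n_features)
    prototype: array of shape (m_regions, n_features)
    The Euclidean nearest-region distance is an illustrative assumption.
    """
    diffs = image[:, None, :] - prototype[None, :, :]
    pairwise = np.sqrt((diffs ** 2).sum(axis=2))   # (n_regions, m_regions)
    return pairwise.min(axis=1)                    # (n_regions,)

def single_valued(image, prototype):
    """Collapse the per-region distances into one number (assumed: the mean)."""
    return region_distances(image, prototype).mean()

def multi_valued(image, prototype):
    """Keep one value per region, so the image becomes a vector per prototype."""
    return region_distances(image, prototype)

def dissimilarity_representation(images, prototypes):
    """Standard representation: one single-valued dissimilarity per prototype."""
    return np.array([[single_valued(x, p) for p in prototypes] for x in images])

def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ordered
    (abnormal, normal) pairs, counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy data: two regions per image, two features per region.
image = np.array([[0.0, 0.0], [3.0, 4.0]])
prototype = np.array([[0.0, 0.0]])

per_region = multi_valued(image, prototype)   # one distance per region
summary = single_valued(image, prototype)     # mean of those distances
```

In this toy case the multi-valued form keeps the fact that one region matches the prototype exactly while the other is far from it, whereas the single-valued form reports only their average. In a full pipeline along the lines of the abstract, one classifier per prototype would be trained on such vectors and their outputs combined, which this sketch does not attempt.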