Classification of mammographic masses using support vector machines and Bayesian networks

M. Samulski, N. Karssemeijer, P. Lucas and P. Groot

Medical Imaging 2007;6514(1):65141J.


In this paper, we compare two state-of-the-art classification techniques for characterizing masses as either benign or malignant, using a dataset of 271 cases (131 benign and 140 malignant), containing both an MLO and a CC view. For suspect regions in a digitized mammogram, 12 out of 81 calculated image features were selected for investigating the classification accuracy of support vector machines (SVMs) and Bayesian networks (BNs). Two additional techniques for improving their performance were included in the comparison: the Manly transformation for achieving a normal distribution of image features, and principal component analysis (PCA) for reducing the dimensionality of the data. The performance of the classifiers was evaluated with Receiver Operating Characteristic (ROC) analysis. The classifiers were trained and tested using k-fold cross-validation (k=10). The area under the ROC curve (Az) of the BN increased significantly (p=0.0002) with the Manly transformation, from Az = 0.767 to Az = 0.795. The Manly transformation did not result in a significant change for SVMs. The difference between SVMs and BNs on the transformed dataset was also not statistically significant (p=0.78). Applying PCA improved the classification accuracy of the naive Bayesian classifier, from Az = 0.767 to Az = 0.786. The difference in classification performance between BNs and SVMs after applying PCA was small and not statistically significant (p=0.11).
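
As a rough illustration of two ingredients named in the abstract, the following minimal sketch shows the Manly exponential transformation of a feature value and an empirical estimate of the area under the ROC curve. It is not the authors' implementation: the function names, the parameter `lam`, and the toy scores are hypothetical, the abstract does not say how the transformation parameter was chosen, and the rank-based area computed here may differ from the Az estimator used in the paper.

```python
import math


def manly_transform(x, lam):
    """Manly exponential transformation of one feature value.

    y = (exp(lam * x) - 1) / lam for lam != 0, and y = x for lam == 0.
    `lam` is a hypothetical, user-supplied parameter in this sketch.
    """
    if lam == 0:
        return x
    return (math.exp(lam * x) - 1.0) / lam


def empirical_auc(scores, labels):
    """Rank-based area under the ROC curve (Mann-Whitney statistic).

    labels: 1 for malignant, 0 for benign. Ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up scores purely for illustration (not data from the paper).
toy_scores = [0.9, 0.4, 0.7, 0.2, 0.6]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = empirical_auc(toy_scores, toy_labels)
```

Because the transformation is strictly increasing in x, it leaves the ranking of a single feature unchanged; any effect on a classifier would come from how that classifier models the feature distribution.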