The objectives of this study were to design and test a fully automated method for classification of microcalcification clusters into malignant and benign types, and to compare the method's performance with that of radiologists. A novel aspect of the approach is that the relative location and orientation of clusters inside the breast were taken into account for feature calculation. Furthermore, the correspondence of cluster locations in the mediolateral oblique (MLO) and cranio-caudal (CC) views was used in feature calculation and in the final classification. Initially, microcalcifications were automatically detected by using a statistical method based on Bayesian techniques and a Markov random field model. To determine malignancy or benignancy of a cluster, a method based on two classification steps was developed. In the first step, individual clusters were classified; in the second step, a patient-based classification was performed. A total of 16 features were used in the study. To identify meaningful features, feature selection was applied, using the area under the receiver operating characteristic (ROC) curve (Az value) as the criterion. For classification, the k-nearest-neighbor method was used in a leave-one-patient-out procedure. A database of 192 mammograms with 280 true-positive detected microcalcification clusters was used for evaluation of the method. The set consisted of cases that were selected for diagnostic workup during a 4-year period of screening in the Nijmegen region (The Netherlands). Because of the high positive predictive value in the screening program (50%), this set did not contain obviously benign cases. The method's best patient-based performance on this set corresponded to Az = 0.83, using nine features. A subset of the data set, containing mammograms from 90 patients, was used for comparing the computer results to radiologists' performance.
Ten radiologists read these cases on a light-box and assessed the probability of malignancy for each patient. All participants had experience in clinical mammography and participated in our observer study during the last 2 days of a 2-week training session leading to screening mammography certification. Results on the subset showed that the method's performance (Az = 0.83) was considerably higher than that of the radiologists (Az = 0.63).
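
The two-step evaluation described above (cluster classification followed by patient-based classification, with k-nearest-neighbor scoring in a leave-one-patient-out procedure) can be illustrated with a minimal sketch. It is not the study's implementation. Several things are assumptions made only for illustration: the toy two-dimensional feature vectors, the choice k = 3, the use of the mean cluster score as the patient score (the text does not state the combination rule), and the empirical (Mann-Whitney) area under the ROC curve, which stands in for the Az value and may differ from a fitted-curve estimate.

```python
import math
from collections import defaultdict


def knn_score(train, query, k):
    """Fraction of malignant labels among the k nearest training clusters."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def leave_one_patient_out(clusters, k=3):
    """clusters: list of (patient_id, feature_vector, label), label 1 = malignant.

    Returns {patient_id: (patient_score, patient_label)}.
    """
    by_patient = defaultdict(list)
    for pid, feats, label in clusters:
        by_patient[pid].append((feats, label))

    results = {}
    for pid, held_out in by_patient.items():
        # Step 1: score each held-out cluster using only other patients' clusters.
        train = [(f, l) for p, f, l in clusters if p != pid]
        scores = [knn_score(train, f, k) for f, _ in held_out]
        # Step 2 (assumed rule): patient score is the mean of its cluster scores;
        # a patient counts as malignant if any of its clusters is malignant.
        patient_label = max(l for _, l in held_out)
        results[pid] = (sum(scores) / len(scores), patient_label)
    return results


def auc(scored):
    """Empirical area under the ROC curve; ties count as one half."""
    pos = [s for s, l in scored if l == 1]
    neg = [s for s, l in scored if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (patient id, two made-up features, label).
clusters = [
    ("p1", (0.90, 0.80), 1), ("p1", (0.80, 0.90), 1),
    ("p2", (0.85, 0.75), 1),
    ("p3", (0.10, 0.20), 0),
    ("p4", (0.20, 0.10), 0), ("p4", (0.15, 0.25), 0),
    ("p5", (0.95, 0.90), 1),
    ("p6", (0.05, 0.10), 0),
]

results = leave_one_patient_out(clusters, k=3)
az = auc(list(results.values()))
```

Holding out all clusters of a patient at once, rather than one cluster at a time, keeps clusters from the same patient (for example, the same lesion seen in the MLO and CC views) out of the training set when that patient is scored.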