Feature selection for computer-aided detection: comparing different selection criteria

R. Hupse and N. Karssemeijer

Medical Imaging 2008;6915:691503.

In this study we investigated different feature selection methods for use in computer-aided mass detection. The data set we used (1357 malignant mass regions and 58444 normal regions) was much larger than those used in previous research, in which feature selection did not directly improve performance compared to using the entire feature set. We introduced a new performance measure to be used during feature selection, defined as the mean sensitivity in an interval of the free response operating characteristic (FROC) curve computed on a logarithmic scale. This measure is similar to the final validation performance measure we were optimizing, and was therefore expected to give better results than more general feature selection criteria. We compared the performance of feature sets selected using the mean sensitivity of the FROC curve to that of sets selected using the Wilks' lambda statistic, and investigated the effect of reducing the skewness in the distribution of the feature values before performing feature selection. In the case of Wilks' lambda, we found that reducing skewness had a clear positive effect, yielding performance similar to or exceeding that obtained when the entire feature set was used. Our results indicate that a general measure like Wilks' lambda selects better-performing feature sets than the mean sensitivity of the FROC curve.
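
The mean-sensitivity criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `mean_froc_sensitivity`, the interval bounds `lo` and `hi` (in false positives per image), the number of sample points, and the use of linear interpolation between operating points are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from bisect import bisect_right


def mean_froc_sensitivity(fp_rates, sensitivities, lo=0.05, hi=0.5, n_points=101):
    """Mean sensitivity over [lo, hi] false positives per image,
    sampled uniformly on a log10 scale of the false positive rate.

    lo, hi and n_points are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if not (0 < lo < hi):
        raise ValueError("require 0 < lo < hi")
    if n_points < 2:
        raise ValueError("require at least two sample points")

    # Drop operating points with a zero FP rate (log10 is undefined there)
    # and sort the rest by log10(FP rate).
    points = sorted(
        (math.log10(f), s) for f, s in zip(fp_rates, sensitivities) if f > 0
    )
    if not points:
        raise ValueError("no operating points with a positive FP rate")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def interpolate(x):
        # Hold the end values outside the measured range (an assumption).
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, x)  # guarantees xs[i-1] <= x < xs[i]
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    log_lo, log_hi = math.log10(lo), math.log10(hi)
    step = (log_hi - log_lo) / (n_points - 1)
    samples = [interpolate(log_lo + k * step) for k in range(n_points)]
    return sum(samples) / n_points
```

Sampling uniformly in the logarithm of the false positive rate gives the low-false-positive part of the curve the same weight as the high-false-positive part. In a wrapper-style selection loop, a candidate feature subset would be scored by training a classifier on it, computing the FROC operating points, and passing them to a function like this one.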