Assessment of individual tumor buds using keratin immunohistochemistry: moderate interobserver agreement suggests a role for machine learning

J. Bokhorst, A. Blank, A. Lugli, I. Zlobec, H. Dawson, M. Vieth, L. Rijstenberg, S. Brockmoeller, M. Urbanowicz, J. Flejou, R. Kirsch, F. Ciompi, J. van der Laak and I. Nagtegaal

Modern Pathology 2019.

Tumor budding is a promising and cost-effective biomarker with strong prognostic value in colorectal cancer. However, challenges related to interobserver variability persist. Such variability may be reduced by immunohistochemistry and computer-aided tumor bud selection. Development of computer algorithms for this purpose requires unequivocal examples of individual tumor buds. As such, we undertook a large-scale, international, digital observer study on individual tumor bud assessment. From a pool of 46 colorectal cancer cases with tumor budding, 3000 tumor bud candidates were selected, largely based on digital image analysis algorithms. For each candidate bud, an image patch (256 × 256 µm) was extracted from a pan-cytokeratin-stained whole-slide image. Members of an International Tumor Budding Consortium (n = 7) were asked to categorize each candidate as either (1) tumor bud, (2) poorly differentiated cluster, or (3) neither, based on current definitions. Agreement was assessed with Cohen's and Fleiss' kappa statistics. Fleiss' kappa showed moderate overall agreement between observers (0.42 and 0.51), while Cohen's kappa values ranged from 0.25 to 0.63. Complete agreement by all seven observers was present for only 34% of the 3000 tumor bud candidates, while 59% of the candidates were agreed on by at least five of the seven observers. Despite reports of moderate-to-substantial agreement with respect to tumor budding grade, agreement with respect to individual pan-cytokeratin-stained tumor buds is moderate at most. A machine learning approach may prove especially useful for a more robust assessment of individual tumor buds.
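
For readers unfamiliar with the agreement statistics named in the abstract, the following is a minimal sketch of how Cohen's kappa (pairwise), Fleiss' kappa (all observers), and the share of candidates with at least k concordant observers can be computed. It uses the standard textbook definitions and is not the authors' analysis code. The function names and the four-item toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two observers who labeled the same items."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each observer's marginal label frequencies.
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    # Undefined (division by zero) if both observers use a single label.
    return (p_observed - p_expected) / (1 - p_expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; ratings[i] holds every observer's label for item i."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()
    p_bar = 0.0
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        # Proportion of agreeing observer pairs for this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        p_bar += agreeing / (n_raters * (n_raters - 1))
    p_bar /= n_items
    p_expected = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    return (p_bar - p_expected) / (1 - p_expected)


def agreement_fraction(ratings, min_raters):
    """Share of items whose most common label has at least min_raters votes."""
    hits = sum(max(Counter(item).values()) >= min_raters for item in ratings)
    return hits / len(ratings)


# Hypothetical toy data: 4 candidates, 7 observers, 3 categories.
toy = [
    ["bud"] * 7,
    ["bud"] * 5 + ["pdc"] * 2,
    ["pdc"] * 4 + ["neither"] * 3,
    ["neither"] * 6 + ["bud"],
]

per_observer = list(zip(*toy))  # one tuple of labels per observer
pairwise = [
    cohen_kappa(per_observer[i], per_observer[j])
    for i, j in combinations(range(len(per_observer)), 2)
]

print("Fleiss' kappa:", fleiss_kappa(toy))
print("Cohen's kappa range:", min(pairwise), max(pairwise))
print("All 7 agree:", agreement_fraction(toy, 7))
print("At least 5 of 7 agree:", agreement_fraction(toy, 5))
```

In this sketch the pairwise Cohen's kappa values correspond to the reported range across observer pairs, while Fleiss' kappa summarizes all observers at once; the two agreement fractions mirror the "all seven" and "at least five of seven" percentages.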