Title: Evaluation of machine learning techniques for prostate cancer diagnosis and Gleason grading

Authors: Eleni Alexandratou, Vassilis Atlamazoglou, Trias Thireou, George Agrogiannis, Dimitrios Togas, Nikolaos Kavantzas, Efstratios Patsouris, Dido Yova

Addresses: Laboratory of Biomedical Optics and Applied Biophysics, School of Electrical and Computer Engineering, National Technical University of Athens, Iroon Politechniou 9, Zografou Campus, 157 80 Athens, Greece (Eleni Alexandratou, Vassilis Atlamazoglou, Trias Thireou, Dido Yova); 1st Department of Pathology, School of Medicine, National and Kapodistrian University of Athens, 75 Mikras Asias Street, Goudi, GR-115 27 Athens, Greece (George Agrogiannis, Dimitrios Togas, Nikolaos Kavantzas, Efstratios Patsouris)

Abstract: Although the Gleason grading scheme has been the gold standard for prostate cancer tissue grading, it is strongly affected by inter- and intra-observer variation. The development of objective and reproducible computer-aided classification methods is therefore of critical importance. In this paper, 16 supervised machine learning algorithms were compared on their performance in prostate cancer diagnosis and Gleason grading. Three classification problems were addressed: tumour vs. non-tumour, low vs. high grade, and the four-class problem of combined diagnosis and grading. Thirteen Haralick texture features were calculated from the grey-level co-occurrence matrix of microscopic prostate tissue images. For the best-performing algorithm in each case, the accuracy obtained was 97.9% for diagnosis (tumour vs. non-tumour), 80.8% for low vs. high grade discrimination, and 77.8% for combined diagnosis and Gleason grading. Logistic regression and sequential minimal optimisation for training a support vector machine were among the four top-scoring algorithms in each classification problem.
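
The abstract does not show how the texture features were computed, so the following is only a minimal sketch of the general technique: it builds a normalised, symmetric grey-level co-occurrence matrix for a single pixel offset and derives four of the thirteen Haralick features from it. The function names (`glcm`, `texture_features`), the choice of offset, and the toy 4x4 image are illustrative assumptions, not the authors' implementation.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalised, symmetric grey-level co-occurrence matrix.

    `image` is a list of rows of integer grey levels in [0, levels).
    (dx, dy) is the pixel offset; the default pairs each pixel with
    its right-hand neighbour. All names here are hypothetical.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0
                counts[j][i] += 1.0  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Four Haralick features of a normalised co-occurrence matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "angular_second_moment": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "inverse_difference_moment": sum(
            v / (1 + (i - j) ** 2) for i, j, v in cells
        ),
        # Zero entries are skipped because 0 * log(0) is taken as 0.
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Toy image with four grey levels (an assumption for illustration).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
toy_features = texture_features(glcm(toy_image, levels=4))
```

In a pipeline like the one the abstract describes, a feature vector of this kind would be computed per tissue image and passed to each supervised classifier under comparison; real images would normally be quantised to a fixed number of grey levels first, and several offsets or directions are often averaged.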

Keywords: Haralick features; Gleason grading; machine learning; data mining; prostate cancer; cancer diagnosis; tissue grading; tumour classification.

DOI: 10.1504/IJCIBSB.2010.031392

International Journal of Computational Intelligence in Bioinformatics and Systems Biology, 2010 Vol.1 No.3, pp.297 - 315

Published online: 02 Feb 2010
