Title: A comprehensive evaluation of machine learning techniques for cancer class prediction based on microarray data

Authors: Khalid Raza; Atif N. Hasan

Addresses: Department of Computer Science, Jamia Millia Islamia (Central University), New Delhi 110025, India; Department of Computer Science, Jamia Millia Islamia (Central University), New Delhi 110025, India

Abstract: Prostate cancer is among the most common cancers in males, and its heterogeneity is well known. Genomic-level changes can be detected in gene expression data, and those changes may serve as a standard model for class prediction on any random cancer data. Various techniques, including machine learning techniques, have been applied to prostate cancer data sets in order to predict cancer class accurately. A large number of attributes combined with a small number of samples in microarray data leads to poor training; therefore, the most challenging part is attribute reduction, that is, the removal of non-significant genes. In this work, a combination of interquartile range and t-test is used for attribute reduction. Further, ten state-of-the-art machine learning techniques are comprehensively evaluated for their accuracy in class prediction of prostate cancer. Of these techniques, Bayes Network performed best with an accuracy of 94.11%, followed by Naïve Bayes with an accuracy of 91.17%.
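
The attribute reduction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the cut-off values, the order of the two filters, or the t-test variant, so everything specific here is an assumption: the function names, the thresholds min_iqr and min_abs_t, the use of Welch's t statistic with a fixed cut-off in place of a p-value, and the toy expression values are all hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import mean, quantiles, variance


def iqr(values):
    """Interquartile range (Q3 - Q1) of one gene's expression values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_genes(expr, labels, min_iqr=0.5, min_abs_t=2.0):
    """Return the names of genes that pass both filters.

    expr      : dict mapping gene name -> list of values, one per sample
    labels    : list of class labels (exactly two distinct values)
    min_iqr   : hypothetical variability cut-off (not from the paper)
    min_abs_t : hypothetical |t| cut-off (not from the paper)
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("expected exactly two classes")
    kept = []
    for gene, values in expr.items():
        # Filter 1: drop genes that barely vary across all samples.
        if iqr(values) < min_iqr:
            continue
        a = [v for v, y in zip(values, labels) if y == classes[0]]
        b = [v for v, y in zip(values, labels) if y == classes[1]]
        # Filter 2: keep genes whose mean differs between the two classes.
        if abs(welch_t(a, b)) >= min_abs_t:
            kept.append(gene)
    return kept


# Toy data: 4 tumor and 4 normal samples, 3 made-up genes.
labels = ["tumor"] * 4 + ["normal"] * 4
expr = {
    "GENE_A": [5.1, 5.3, 4.9, 5.2, 2.0, 2.2, 1.9, 2.1],    # class-dependent
    "GENE_B": [3.0, 3.01, 2.99, 3.0, 3.0, 3.01, 2.99, 3.0],  # nearly constant
    "GENE_C": [1.0, 4.0, 2.0, 5.0, 1.5, 4.5, 2.5, 4.0],    # noisy, not class-dependent
}
selected = select_genes(expr, labels)
print(selected)
```

In this toy run, GENE_B is removed by the interquartile range filter and GENE_C by the t-test filter, so only GENE_A would be passed on to a classifier. Applying the cheaper variability filter first is one plausible design choice, since it reduces the number of genes that need a per-class comparison.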

Keywords: cancer class prediction; machine learning; microarray analysis; prostate cancer; Bayes network; bioinformatics.

DOI: 10.1504/IJBRA.2015.071940

International Journal of Bioinformatics Research and Applications, 2015 Vol.11 No.5, pp.397 - 416

Accepted: 20 Apr 2015
Published online: 24 Sep 2015
