Authors: Shinuk Kim; Hyowon Lee; Mark Kon
Addresses: Department of Civil Engineering, Sangmyung University, Cheonan 31066, South Korea; Department of Biomedical Technology, Sangmyung University, Cheonan 31066, South Korea; Department of Mathematics and Statistics, Boston University, Boston, MA 02215, USA
Abstract: Studies in computational cancer genomics face the challenge of improving the accuracy of predictions made from molecular datasets. Here we outline how feature selection combined with machine learning may help overcome this challenge for breast invasive carcinoma (BRCA) miRNA_seq, RNA_seq and mRNA microarray datasets, and for bladder urothelial carcinoma (BLCA) miRNA_seq and RNA_seq datasets. We used three classifiers: (a) support vector machine, (b) decision tree and (c) k-nearest neighbours, and two feature selection methods: (a) the Fisher feature criterion and (b) infinite feature selection. Our computational approaches performed consistently better with RNA_seq datasets than with miRNA_seq or mRNA microarray datasets.
Keywords: feature selection; machine learning methods; classification methods; RNA_sequence; miRNA_sequence; breast invasive carcinoma; bladder urothelial carcinoma.
International Journal of Data Mining and Bioinformatics, 2017 Vol.17 No.4, pp.359 - 368
Available online: 05 Aug 2017
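
Illustrative note: the abstract names the Fisher feature criterion as one of the two feature selection methods but does not give its formula. The sketch below assumes one common multi-class form of the Fisher score (between-class scatter of a feature divided by its within-class scatter) and ranks features by it. The function names `fisher_scores` and `top_k_features` and the toy data are hypothetical and are not taken from the paper; the selected features would then be passed to a classifier such as a support vector machine, decision tree or k-nearest neighbours.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature (column) of X.

    Assumed form: sum_c n_c * (mu_c - mu)^2 / sum_c n_c * var_c,
    where c runs over the class labels in y.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = mean(column)
        between = 0.0  # between-class scatter of feature j
        within = 0.0   # within-class scatter of feature j
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (mean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # A feature that is constant within every class gets score 0 here
        # (an arbitrary convention chosen to avoid division by zero).
        scores.append(between / within if within > 0 else 0.0)
    return scores


def top_k_features(X, y, k):
    """Return the indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Hypothetical toy data: 6 samples, 3 features, 2 classes.
# Feature 0 separates the classes well; features 1 and 2 do not.
X = [
    [1.0, 5.0, 0.3],
    [1.2, 4.0, 0.9],
    [0.8, 6.0, 0.5],
    [3.0, 5.5, 0.4],
    [3.2, 4.5, 0.8],
    [2.8, 5.0, 0.6],
]
y = [0, 0, 0, 1, 1, 1]

selected = top_k_features(X, y, 2)
print(selected)
```

On this toy data, feature 0 receives by far the highest score because its class means differ while its within-class spread is small, which is the behaviour a filter-style criterion of this kind is meant to capture.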