Title: An improved gSVM-SCADL2 with firefly algorithm for identification of informative genes and pathways

Authors: Weng Howe Chan; Mohd Saberi Mohamad; Safaai Deris; Juan Manuel Corchado; Sigeru Omatu; Zuwairie Ibrahim; Shahreen Kasim

Addresses: Artificial Intelligence and Bioinformatics Research Group, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia; Artificial Intelligence and Bioinformatics Research Group, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia; Faculty of Creative Technology & Heritage, Universiti Malaysia Kelantan, Locked Bag 01, 16300 Bachok, Kota Bharu, Kelantan, Malaysia; Biomedical Research Institute of Salamanca/BISITE Research Group, University of Salamanca, Salamanca, Spain; Department of Electronics, Information and Communication Engineering, Osaka Institute of Technology, Osaka 535-8585, Japan; Faculty of Electrical and Electronics Engineering, Universiti Malaysia Pahang, 26600 Pekan, Pahang, Malaysia; Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Malaysia

Abstract: Incorporating pathway knowledge into microarray analysis has been favoured by researchers because it improves the biological interpretation of the analysis outcome. However, most pathway data are manually curated without a specific biological context. Including non-informative genes in the analysis of context-specific microarray data could lead to a classifier with poor discriminative power. Thus, one of the main challenges is how to identify informative genes from the pathway data effectively. This paper proposes a firefly-optimised penalised support vector machine with the SCADL2 penalty function (SVM-SCADL2-FFA), in which the firefly algorithm optimises the tuning parameters for each pathway so that informative genes and pathways can be identified efficiently. Experiments are conducted on lung cancer and gender data sets. Tenfold cross-validation (CV) is used to evaluate performance in terms of accuracy, specificity, sensitivity and F-score. The identified informative genes are validated through online databases. The proposed method shows consistent improvements compared to previous works.
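
As an illustration of the four evaluation measures named in the abstract, the following minimal sketch computes accuracy, sensitivity, specificity and F-score from binary class labels. It is not taken from the paper: the function name `classification_metrics`, the choice of label 1 as the positive class, and the reading of "F-score" as the balanced F1 measure are all assumptions made for this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity and F-score for binary labels.

    Assumption: "F-score" is taken to be F1, the harmonic mean of
    precision and sensitivity (recall).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Guard against empty classes; report 0.0 instead of dividing by zero.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical labels for one held-out fold (not data from the paper).
fold_true = [1, 1, 1, 0, 0, 0, 0, 1]
fold_pred = [1, 1, 0, 0, 0, 1, 0, 1]
fold_metrics = classification_metrics(fold_true, fold_pred)
```

In a tenfold cross-validation setting, one plausible use is to call such a function once per held-out fold and average the resulting values across the ten folds.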

Keywords: pathway-based microarray analysis; gene selection; penalised SVM; support vector machines; bioinformatics; artificial intelligence; firefly algorithm; genes; pathways; lung cancer; gender.

DOI: 10.1504/IJBRA.2016.075404

International Journal of Bioinformatics Research and Applications, 2016 Vol.12 No.1, pp.72 - 93

Received: 14 Mar 2015
Accepted: 27 Sep 2015

Published online: 19 Mar 2016
