Title: Gene subsets extraction based on Mutual-Information-based Minimum Spanning Trees model

Authors: Jieyue He, Fang Zhou, Wei Zhong, Yi Pan

Addresses: School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; Department of Computer Science, University of Helsinki, P.O. Box 68, FI-00014 Helsinki, Finland; Division of Math and Computer Science, University of South Carolina Upstate, Spartanburg, SC 29303, USA; Department of Computer Science, Georgia State University, 34 Peachtree Street, Room 1417, Atlanta, GA 30303, USA

Abstract: In microarray data analysis, filter methods have low time complexity but neglect correlation among genes. In some methods, the metrics used to calculate correlation cannot effectively reflect functional similarity among genes, and the time complexity depends on the whole gene set. This paper therefore proposes a novel selection model called Mutual-Information-based Minimum Spanning Trees (MIMST), which first uses filter methods to remove non-relevant genes, then computes the interdependence of the top-ranked genes, and finally eliminates the redundant genes. The empirical results show that MIMST can find the smallest significant gene subset with higher classification accuracy than other methods.
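
The three stages named in the abstract (filter, interdependence, redundancy elimination) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the spanning tree is used to drop redundant genes, so the rule below (merge genes joined by a tree edge whose mutual information reaches a threshold, then keep the most relevant gene of each group) is an assumption, as are the use of mutual information for the filter step, the function names, and the parameters `top_k` and `redundancy_threshold`. Expression values are assumed to be already discretized.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def minimum_spanning_tree(nodes, weight):
    """Prim's algorithm on a complete graph; returns a list of (u, v, w) edges."""
    nodes = list(nodes)
    if not nodes:
        return []
    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        # Cheapest edge leaving the current tree.
        u, v = min(
            ((a, b) for a in in_tree for b in nodes if b not in in_tree),
            key=lambda e: weight(*e),
        )
        edges.append((u, v, weight(u, v)))
        in_tree.add(v)
    return edges


def select_genes(expr, labels, top_k, redundancy_threshold):
    """expr maps gene name -> list of discretized expression levels (hypothetical layout)."""
    # Stage 1 (filter): rank genes by relevance to the class label, keep the top_k.
    relevance = {g: mutual_information(v, labels) for g, v in expr.items()}
    top = sorted(relevance, key=relevance.get, reverse=True)[:top_k]

    # Stage 2: interdependence among top-ranked genes. Using -MI as the edge
    # weight makes the minimum spanning tree connect the most dependent pairs.
    def neg_mi(a, b):
        return -mutual_information(expr[a], expr[b])

    tree = minimum_spanning_tree(top, neg_mi)

    # Stage 3 (assumed rule): genes joined by a strongly dependent tree edge
    # form one redundant group; keep only the most relevant gene per group.
    parent = {g: g for g in top}

    def find(g):
        while parent[g] != g:
            g = parent[g]
        return g

    for u, v, w in tree:
        if -w >= redundancy_threshold:
            parent[find(u)] = find(v)

    selected, seen = [], set()
    for g in top:  # top is ordered by decreasing relevance
        root = find(g)
        if root not in seen:
            seen.add(root)
            selected.append(g)
    return selected
```

Because the pairwise mutual information is computed only among the `top_k` genes that survive the filter, the quadratic cost of the interdependence step no longer depends on the size of the whole gene set, which is the motivation the abstract gives.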

Keywords: gene expressions; microarray data analysis; gene selection; MST; minimum spanning trees; mutual information; gene subsets; subset extraction; filtering; classification accuracy.

DOI: 10.1504/IJCBDD.2009.028823

International Journal of Computational Biology and Drug Design, 2009 Vol.2 No.2, pp.187 - 203

Published online: 03 Oct 2009
