Title: Semantic similarity based feature extraction from microarray expression data

 

Authors: Young-Rae Cho, Aidong Zhang, Xian Xu

 

Addresses:
Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA.
Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA.
Microsoft Corporation, Redmond, WA 98052, USA.

 

Journal: Int. J. of Data Mining and Bioinformatics, 2009 Vol.3, No.3, pp.333 - 345

 

Abstract: Previous studies have proven that it is feasible to build sample classifiers using gene expression profiles. To build an effective sample classifier, a dimension reduction process is necessary, since classic pattern recognition algorithms do not work well in high-dimensional spaces. In this paper, we present a novel feature extraction algorithm that integrates microarray expression data with the Gene Ontology (GO). Applying semantic similarity measures, we identify groups of genes, called virtual genes, that potentially interact with each other to perform a biological function. The correlation in the expression of virtual genes is then used to classify samples. For colon cancer data, this approach significantly improved the classification accuracy by more than 10%.

 

Keywords: feature extraction; microarray expression data; semantic similarity; bioinformatics; gene ontology; virtual genes; colon cancer data; sample classifiers; gene expression profiles; classification accuracy.

 

DOI: http://dx.doi.org/10.1504/IJDMB.2009.026705

 

Available online 23 Jun 2009

 

 
