Title: Efficient Super Granular SVM Feature Elimination (Super GSVM-FE) model for protein sequence motif information extraction

Authors: Bernard Chen, Stephen Pellicer, Phang C. Tai, Robert Harrison, Yi Pan

Addresses: Department of Computer Science, Georgia State University, 34 Peachtree Street Room 1417 A, Atlanta, GA 30303, USA; Department of Computer Science, Georgia State University, 34 Peachtree Street Room 1417 B, Atlanta, GA 30303, USA; Department of Biology, Georgia State University, 402 Kell Hall, 24 Peachtree Center Ave., Atlanta, GA 30303, USA; Department of Computer Science, Georgia State University, 34 Peachtree Street Room 1440, Atlanta, GA 30303, USA; Department of Computer Science, Georgia State University, 34 Peachtree Street Room 1442, Atlanta, GA 30303, USA

Abstract: Protein sequence motifs are attracting increasing attention in the sequence analysis area. These conserved regions have the potential to determine the conformation, function and activities of proteins. We develop a new method that combines the concept of granular computing with the power of Ranking SVM to further extract protein sequence motif information generated by the FGK model. Applying this new feature elimination model dramatically increases the quality of motif information in all three evaluation measures. Since the training step of Ranking SVM is very time consuming, we also provide a feasible way to reduce the training time dramatically without sacrificing quality.

Keywords: FIK model; FGK model; ranking SVM; feature elimination; protein sequencing; motif information extraction; protein sequence motifs; granular computing; support vector machines.

DOI: 10.1504/IJFIPM.2008.018290

International Journal of Functional Informatics and Personalised Medicine, 2008 Vol.1 No.1, pp.8 - 25

Published online: 14 May 2008
