Title: Hierarchical classification of G-Protein-Coupled Receptors with data-driven selection of attributes and classifiers

Authors: A. Secker, M.N. Davies, A.A. Freitas, E.B. Clark, J. Timmis, D.R. Flower

Addresses: Computing Laboratory and Centre for BioMedical Informatics, University of Kent, Canterbury, CT2 7NF, UK; Edward Jenner Institute, Compton, Newbury, Berkshire, RG20 7NN, UK; Computing Laboratory and Centre for BioMedical Informatics, University of Kent, Canterbury, CT2 7NF, UK; Departments of Computer Science and Electronics, University of York, York, YO10 5DD, UK; Departments of Computer Science and Electronics, University of York, York, YO10 5DD, UK; Edward Jenner Institute, Compton, Newbury, Berkshire, RG20 7NN, UK

Abstract: We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
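
The top-down scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not say which attribute selection methods, classifiers or selection criterion were used, so the two toy classifiers, the leave-one-out selection criterion, the synthetic data and all function names below are assumptions made for illustration only. At each node of the class hierarchy, the sketch picks the (attribute subset, classifier) pair that scores best on that node's own training data, then recurses into the child classes.

```python
from collections import defaultdict

def dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

# Two toy candidate classifiers (assumed for illustration).
# Each trainer returns a function mapping a feature vector to a label.
def train_nearest_centroid(X, y):
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] += 1
    cents = {c: [v / counts[c] for v in s] for c, s in sums.items()}
    return lambda row: min(cents, key=lambda c: dist(row, cents[c]))

def train_one_nn(X, y):
    pairs = list(zip(X, y))
    return lambda row: min(pairs, key=lambda p: dist(row, p[0]))[1]

def project(row, attrs):
    """Keep only the selected attribute indices."""
    return [row[i] for i in attrs]

def fit_node(X, y, attr_subsets, trainers):
    """Choose the (attributes, classifier) pair with the most leave-one-out hits."""
    best = None
    for attrs in attr_subsets:
        P = [project(r, attrs) for r in X]
        for trainer in trainers:
            hits = 0
            for i in range(len(P)):
                model = trainer(P[:i] + P[i + 1:], y[:i] + y[i + 1:])
                hits += model(P[i]) == y[i]
            if best is None or hits > best[0]:
                best = (hits, attrs, trainer)
    _, attrs, trainer = best
    model = trainer([project(r, attrs) for r in X], y)  # refit on all node data
    return (lambda row: model(project(row, attrs))), attrs

def fit_hierarchy(X, paths, attr_subsets, trainers, depth=0):
    """paths[i] is the root-to-leaf tuple of class labels for example i."""
    keep = [i for i, p in enumerate(paths) if depth < len(p)]
    if not keep:
        return None  # below the leaves of the class hierarchy
    rows = [X[i] for i in keep]
    sub = [paths[i] for i in keep]
    labels = [p[depth] for p in sub]
    if len(set(labels)) == 1:
        predict, attrs = (lambda row, only=labels[0]: only), None
    else:
        predict, attrs = fit_node(rows, labels, attr_subsets, trainers)
    children = {
        c: fit_hierarchy([r for r, l in zip(rows, labels) if l == c],
                         [p for p, l in zip(sub, labels) if l == c],
                         attr_subsets, trainers, depth + 1)
        for c in set(labels)
    }
    return {"predict": predict, "attrs": attrs, "children": children}

def predict_path(node, row):
    """Walk from the root down, applying each node's own classifier."""
    path = []
    while node is not None:
        label = node["predict"](row)
        path.append(label)
        node = node["children"].get(label)
    return tuple(path)

# Synthetic two-level hierarchy: attribute 0 separates A from B,
# attribute 1 separates the subclasses within each family.
X, paths = [], []
for top, x0 in (("A", 0.0), ("B", 10.0)):
    for sub_name, x1 in (("1", 0.0), ("2", 10.0)):
        for k in range(6):
            X.append([x0 + 0.1 * k, x1 + 0.1 * k])
            paths.append((top, top + sub_name))

tree = fit_hierarchy(X, paths, [(0,), (1,), (0, 1)],
                     [train_nearest_centroid, train_one_nn])
print(predict_path(tree, [0.2, 10.1]))
```

On this synthetic data the root node and the node for class A end up using different attributes, which is the point of selecting attributes per node rather than once for the whole hierarchy. Leave-one-out selection assumes every class at a node has at least two examples; a real system would need a more careful evaluation protocol.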

Keywords: hierarchical classification; supervised learning; attribute selection; feature selection; classifier selection; protein function prediction; GPCR; G-protein coupled receptor; bioinformatics.

DOI: 10.1504/IJDMB.2010.032150

International Journal of Data Mining and Bioinformatics, 2010 Vol.4 No.2, pp.191 - 210

Published online: 11 Mar 2010
