Title: Transductive learning with EM algorithm to classify proteins based on phylogenetic profiles

Authors: Roger A. Craig, Li Liao

Addresses: Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA; Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA

Abstract: We proposed a novel method for protein classification based on phylogenetic profiles. Each protein's profile was extended with extra bits encoding the phylogenetic tree structure and the likelihood, in the form of weights on profile indices, of the protein's functional family membership in each of the reference genomes. The extended profiles were then integrated as part of a kernel of a support vector machine, which was trained in a transductive learning scheme using the EM algorithm to update the weights. Classification accuracy was greatly increased when tested on the proteome of Saccharomyces cerevisiae using the MIPS classification as a benchmark.
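
As a rough illustration of what "weights on profile indices" inside a kernel could look like, the Python sketch below computes a weighted linear kernel between two binary phylogenetic profiles. The function name `weighted_profile_kernel`, the toy profiles, and the weight values are all assumptions made for illustration. The abstract does not specify the kernel form, the tree-structure encoding, or the EM update rule, so none of those are reproduced here.

```python
from typing import Sequence


def weighted_profile_kernel(
    x: Sequence[int], y: Sequence[int], w: Sequence[float]
) -> float:
    """Weighted linear kernel: sum_i w[i] * x[i] * y[i].

    Hypothetical kernel form, chosen only to show how per-index
    weights can enter a similarity between two profiles.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("profiles and weights must have the same length")
    return sum(wi * xi * yi for xi, yi, wi in zip(x, y, w))


# Toy binary profiles: 1 = homolog present in reference genome i, 0 = absent.
profile_a = [1, 0, 1, 1]
profile_b = [1, 1, 0, 1]
weights = [0.5, 1.0, 1.0, 2.0]  # illustrative values, not from the paper

k_ab = weighted_profile_kernel(profile_a, profile_b, weights)
print(k_ab)  # 2.5
```

With non-negative weights this function is a valid (positive semidefinite) kernel, so a Gram matrix built from it could in principle be supplied to a support vector machine as a precomputed kernel. A transductive scheme would then alternate between fitting the classifier and re-estimating the weights, but the details of that loop are not given in the abstract.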

Keywords: protein classification; transductive learning; support vector machines; SVMs; phylogenetic profiles; EM algorithm; data mining; bioinformatics.

DOI: 10.1504/IJDMB.2007.012964

International Journal of Data Mining and Bioinformatics, 2007 Vol.1 No.4, pp.337 - 351

Published online: 02 Apr 2007
