Title: Mining Cytochrome b561 proteins from plant genomes

Authors: Stephen O. Opiyo, Etsuko N. Moriyama

Addresses: School of Biological Sciences, University of Nebraska-Lincoln, NE 68588, USA; School of Biological Sciences and Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68588-0118, USA

Abstract: Cytochrome b561 (Cyt-b561) proteins are important for plant growth, development, and prevention of damage to plants. Because of their high sequence divergence, thorough mining of Cyt-b561 proteins from plant genomes is not easy. Currently, only one Cyt-b561 gene has been found in the maize genome and none in the soybean genome, whereas 22 have been identified in the Arabidopsis thaliana genome. We tested alignment-free protein classifiers based on partial least squares (PLS) and support vector machines (SVMs) to identify Cyt-b561 proteins. These classifiers performed better than profile hidden Markov models and PSI-BLAST. Using these classifiers, we identified new Cyt-b561-related proteins from four plant genomes.

Keywords: Cytochrome b561 proteins; PLS; partial least squares; SVMs; support vector machines; profile hidden Markov models; PSI-BLAST; plant genomes; bioinformatics; protein classifiers; data mining.

DOI: 10.1504/IJBRA.2010.032122

International Journal of Bioinformatics Research and Applications, 2010 Vol.6 No.2, pp.209 - 221

Published online: 10 Mar 2010
