Title: Struct-NB: predicting protein-RNA binding sites using structural features

Authors: Fadi Towfic, Cornelia Caragea, David C. Gemperline, Drena Dobbs, Vasant Honavar

Addresses: Bioinformatics and Computational Biology Graduate Program, Iowa State University, Ames, IA 50011-1040, USA; Department of Computer Science, Iowa State University, Ames, IA 50011-1040, USA; Department of Biology, Department of Chemistry, Carthage College, 2001 Alford Park Drive, Kenosha, WI 53140-1994, USA; Department of Genetics, Development and Cell Biology, Bioinformatics and Computational Biology Graduate Program, Iowa State University, Ames, IA 50011-1040, USA; Department of Computer Science, Bioinformatics and Computational Biology Graduate Program, Iowa State University, Ames, IA 50011-1040, USA

Abstract: We analyse sequence and structural features of protein-RNA interfaces using RB-147, a non-redundant dataset of protein-RNA complexes extracted from the PDB. We train classifiers using machine learning algorithms to predict protein-RNA interfaces from sequence and structure-derived features of proteins. Our experiments show that Struct-NB, a Naive Bayes classifier that exploits structural features, outperforms its counterparts that use only sequence features to predict protein-RNA binding residues.

Keywords: protein-RNA interactions; propensity; structural features; protein sequences; RNA; Bayes classifiers; machine learning; bioinformatics; protein features; protein-RNA binding.

DOI: 10.1504/IJDMB.2010.030965

International Journal of Data Mining and Bioinformatics, 2010, Vol. 4, No. 1, pp. 21-43

Published online: 14 Jan 2010
