Title: A rule-based approach for RNA pseudoknot prediction

Authors: X.Z. Fu, H. Wang, R.W. Harrison, W.L. Harrison

Addresses: Department of Computer Science, Georgia State University, Atlanta GA 30303, USA; Department of Computer Science, Georgia State University, Atlanta GA 30303, USA; Department of Computer Science and Biology, Georgia State University, Atlanta GA 30303, USA; Department of Computer Science, University of Missouri, Columbia MO 65211, USA

Abstract: RNA plays a critical role in mediating every step of cellular information transfer from genes to functional proteins. Pseudoknots are functionally important and widely occurring structural motifs found in all types of RNA. Therefore, predicting their structures is an important problem. In this paper, we present a new RNA pseudoknot structure prediction method based on term rewriting. The method is implemented using the Mfold RNA/DNA folding package and the term rewriting language Maude. In our method, RNA structures are treated as terms and rules are discovered for predicting pseudoknots. Our method was tested on 211 pseudoknots in PseudoBase and achieved an average accuracy of 74.085% compared to the experimentally determined structures. In fact, most pseudoknots discovered by our method achieve an accuracy above 90%. These results indicate that term rewriting has broad potential in RNA applications, ranging from prediction of pseudoknots to discovery of higher-level RNA structures involving complex RNA tertiary interactions.

Keywords: term rewriting; rule discovery; RNA pseudoknot prediction; pseudoknots; data mining; bioinformatics; RNA structures.

DOI: 10.1504/IJDMB.2008.016757

International Journal of Data Mining and Bioinformatics, 2008 Vol.2 No.1, pp.78 - 93

Published online: 21 Jan 2008
