Title: A bottom-up clustering algorithm to detect ncRNA molecules with a common secondary structure

Authors: Yair Horesh, Ron Unger

Addresses: Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel; Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel

Abstract: Recently, there has been much interest in exploring the universe of non-protein coding RNA molecules that operate in the cell. We suggested an approach using a simple two-dimensional representation of RNA molecules that can identify common structural features of RNA molecules. Here, we address a common situation in which there is a large and diverse population of candidate molecules, and the task is to identify a small subset (or subsets) of RNA molecules that share a common structure. With certain constraints, our algorithm enumerates all possible sets of RNA molecules that have a common structure by first grouping together all molecules that have a single common structural feature and then, using an iterative approach, searching for subsets that share additional structural motifs. In a computational experiment, we were able to detect members of three small classes of RNA molecules, each containing several dozen members, that were mixed in a population of 2778 non-coding sequences common to two trypanosome species.
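
The bottom-up idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each molecule has already been reduced to a set of hashable structural-feature labels (the names `stemA`, `rna1`, and so on are hypothetical), and the function name `common_structure_clusters` and the parameters `min_members` and `max_features` are stand-ins for the unspecified "certain constraints". Molecules are first grouped by each single feature, and the groups are then extended one feature at a time, keeping only those that still have enough members.

```python
def common_structure_clusters(features_by_molecule, min_members=2, max_features=3):
    """Map each shared feature set to the molecules that contain all of it.

    features_by_molecule: dict of molecule id -> set of feature labels
    (a hypothetical encoding; how features are extracted is out of scope here).
    """
    # Level 1: group molecules by every single structural feature.
    by_feature = {}
    for molecule, features in features_by_molecule.items():
        for feature in features:
            by_feature.setdefault(feature, set()).add(molecule)

    singles = {
        frozenset([feature]): frozenset(molecules)
        for feature, molecules in by_feature.items()
        if len(molecules) >= min_members
    }

    results = dict(singles)
    level = singles
    size = 1

    # Iteratively extend each surviving feature set by one more feature.
    while level and size < max_features:
        next_level = {}
        for feature_set, molecules in level.items():
            for single_set, single_molecules in singles.items():
                if single_set <= feature_set:
                    continue  # feature already part of this set
                shared = molecules & single_molecules
                if len(shared) >= min_members:
                    next_level[feature_set | single_set] = shared
        results.update(next_level)
        level = next_level
        size += 1

    return results


# Toy input with hypothetical feature labels.
molecules = {
    "rna1": {"stemA", "stemB", "stemC"},
    "rna2": {"stemA", "stemB"},
    "rna3": {"stemA", "stemC"},
    "rna4": {"stemD"},
}
clusters = common_structure_clusters(molecules)
for feature_set, members in clusters.items():
    print(sorted(feature_set), sorted(members))
```

In this toy input, `rna4` never joins a cluster because no other molecule shares `stemD`, and the pair `stemB` plus `stemC` is dropped because only `rna1` carries both. Pruning by a minimum member count at each level keeps the enumeration tractable, since a feature set can only have enough members if each of its subsets does.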

Keywords: clustering algorithms; RNA secondary structure; dot-matrix; novel ncRNA families; bioinformatics; non-protein coding RNA molecules; common structures.

DOI: 10.1504/IJBRA.2005.007907

International Journal of Bioinformatics Research and Applications, 2005 Vol.1 No.3, pp.292 - 304

Published online: 30 Sep 2005
