Title: An algorithm for the reconstruction of consensus sequences of ancient segmental duplications and transposon copies in eukaryotic genomes

Authors: Abanish Singh, Umeshkumar Keswani, David Levine, Cedric Feschotte, Nikola Stojanovic

Addresses: Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA; Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA

Abstract: Interspersed repeats, mostly resulting from the activity and accumulation of transposable elements, occupy a significant fraction of many eukaryotic genomes. More than half of the human genomic sequence consists of known repeats; however, a very large part has not yet been associated with either repetitive structures or functional units. We have postulated that most of the seemingly unique content of mammalian genomes is also a result of transposon activity, written software to look for weak signals that would help us reconstruct the ancient elements with substantially mutated copies, and integrated it into a system for de novo identification and classification of interspersed repeats. In this manuscript we describe our approach and report on our methods for building the consensus sequences of these transposons.

Keywords: algorithms; graphs; DNA sequence analysis; DNA sequence repeats; transposons; consensus sequences; eukaryotic genomes; bioinformatics.

DOI: 10.1504/IJBRA.2010.032118

International Journal of Bioinformatics Research and Applications, 2010, Vol. 6, No. 2, pp. 147-162

Published online: 10 Mar 2010
