Title: A study of the repetitive structure and distribution of short motifs in human genomic sequences

Authors: Abanish Singh, Cedric Feschotte, Nikola Stojanovic

Addresses: Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA; Department of Biology, University of Texas at Arlington, Arlington, TX 76019, USA; Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA

Abstract: Over the last several years, the search for functional genomic elements by exploiting motif over-representation has become increasingly popular. However, about half of the human genome is repetitive, as are the genomes of most higher eukaryotes. In this study we show that, in addition to these known repeats, human sequences feature many short over-represented motifs, and that the frequency of these motifs varies only slightly between random repeat-masked sequences and regions located immediately upstream of known genes. Most of our study was performed on the ENCODE sequences, which comprise about 1% of the human genome.

Keywords: DNA; repeated sequences; functional elements; sequence motifs; bioinformatics; human genome; short motifs.

DOI: 10.1504/IJBRA.2007.015419

International Journal of Bioinformatics Research and Applications, 2007 Vol.3 No.4, pp.523 - 535

Published online: 15 Oct 2007
