Title: GRASPm: an efficient algorithm for exact pattern-matching in genomic sequences

Authors: Sergio Deusdado, Paulo Carvalho

Addresses: Centre for Mountain Research (CIMO), Polytechnic Institute of Braganca, 5301-854 Braganca, Portugal; Department of Informatics, School of Engineering, University of Minho, 4710-553 Braga, Portugal

Abstract: In this paper, we propose Genomic-oriented Rapid Algorithm for String Pattern-match (GRASPm), an algorithm centred on overlapped 2-gram analysis, which introduces a novel filtering heuristic – the compatibility rule – achieving a significant efficiency gain. GRASPm's foundations rely especially on a wide searching window that uses the central duplet as a reference for fast filtering of multiple alignments. Subsequently, superfluous detailed verifications are avoided by filtering out incompatible alignments using the idcd (involving duplet of central duplet) concept combined with pre-processed conditions, allowing fast parallel testing of multiple alignments. Comparative performance analysis, using diverse genomic data, shows that GRASPm is faster than its competitors.
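
The general idea of filtering candidate alignments with 2-grams (duplets) before detailed verification can be illustrated with a minimal sketch. This is not GRASPm itself: the compatibility rule and the idcd conditions are not specified in the abstract and are not reproduced here. The function name `duplet_filter_search` and the sampling scheme (one text duplet read every m-1 positions) are assumptions chosen only for illustration.

```python
from collections import defaultdict


def duplet_filter_search(text, pattern):
    """Exact matching with a simple duplet (2-gram) filter.

    Illustrative only: a generic q-gram filter, not the GRASPm algorithm.
    Returns the sorted start positions of all occurrences of pattern in text.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return []
    if m < 2 or n < m:
        # Patterns shorter than one duplet: fall back to a naive scan.
        return [i for i in range(n - m + 1) if text[i:i + m] == pattern]

    # Pre-processing: record every offset at which each duplet occurs in the pattern.
    offsets = defaultdict(list)
    for j in range(m - 1):
        offsets[pattern[j:j + 2]].append(j)

    matches = []
    # Any occurrence contains m-1 consecutive duplet start positions, so
    # sampling one text duplet every m-1 positions hits each occurrence once.
    for i in range(0, n - 1, m - 1):
        for j in offsets.get(text[i:i + 2], ()):
            start = i - j  # candidate alignment implied by this duplet
            if start >= 0 and start + m <= n and text[start:start + m] == pattern:
                matches.append(start)
    return sorted(matches)
```

For example, `duplet_filter_search("ACGTACGTTACG", "ACG")` inspects only six text duplets and fully verifies three alignments. GRASPm, as described above, goes further by discarding incompatible alignments before verification.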

Keywords: pattern matching; sequence searching; sequence analysis; motif discovery; genomic sequences; bioinformatics; filtering heuristics; genome.

DOI: 10.1504/IJBRA.2009.027510

International Journal of Bioinformatics Research and Applications, 2009 Vol.5 No.4, pp.385 - 401

Published online: 28 Jul 2009
