Title: Efficient composite pattern finding from monad patterns

Authors: Jianjun Zhou, Jorg Sander, Guohui Lin

Addresses: Bioinformatics Research Group, Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada. ' Bioinformatics Research Group, Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada. ' Bioinformatics Research Group, Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada

Abstract: Automatically identifying frequent composite patterns in DNA sequences is an important task in bioinformatics, especially when all the basic elements (or monad patterns) of a composite pattern are weak. In this paper, we compare one straightforward approach to assemble the monad patterns into composite patterns to two other rather complex approaches. Both our theoretical analysis and empirical results show that this overlooked straightforward method can be several orders of magnitude faster. Furthermore, different from the previous understandings, the empirical results show that the runtime superiority among the three approaches is closely related to the insignificance of the monad patterns.

Keywords: pattern finding; monad patterns; composite patterns; runtime complexity; bioinformatics; pattern recognition; DNA sequences.

DOI: 10.1504/IJBRA.2007.011836

International Journal of Bioinformatics Research and Applications, 2007 Vol.3 No.1, pp.86 - 99

Published online: 26 Dec 2006 *

Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article