Title: Alignment of biological sequences with quality scores

Authors: Joong Chae Na, Kangho Roh, Alberto Apostolico, Kunsoo Park

Addresses: Department of Computer Engineering, Sejong University, Seoul 143-747, South Korea; Memory Division, Semiconductor Business, Samsung Electronics Co., Ltd., Hwasung 445-701, South Korea; Accademia Nazionale dei Lincei and DEI, Università di Padova, Italy, and College of Computing, Georgia Institute of Technology, 801 Atlantic Drive, Atlanta, GA 30332, USA; School of Computer Science and Engineering, Seoul National University, Seoul 151-742, South Korea

Abstract: In this paper we consider the problem of sequence alignment with quality scores. DNA sequences produced by a base-calling program (as part of sequencing) carry quality scores that represent the confidence level for each individual base. However, previous sequence alignment algorithms do not take these quality scores into account. To solve sequence alignment with quality scores, we first consider a more general problem in which the inputs are weighted sequences, that is, sequences that give, for each position, the probability that each character occurs there. We propose a meaningful measure for an alignment of two weighted sequences and show that an optimal alignment under this measure can be found by dynamic programming. Sequence alignment with quality scores can then be solved as a special case of the weighted sequence alignment problem.
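
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the paper's alignment measure, so the code below assumes one plausible choice: the expected match/mismatch score of two positions, treated as independent, plugged into a standard global-alignment dynamic program. The names `phred_to_dist`, `weighted`, `expected_score` and `align`, the scoring parameters, and the assumption that a base-calling error is spread uniformly over the other three bases are all illustrative and are not taken from the paper. The only standard fact used is the Phred convention Q = -10 log10(P_error).

```python
from typing import Dict, List

ALPHABET = "ACGT"
Dist = Dict[str, float]  # probability of each base at one position


def phred_to_dist(base: str, q: int) -> Dist:
    """Turn a called base and its Phred quality into a per-position distribution.

    Assumption: the error probability is split evenly over the other bases.
    """
    p_err = 10 ** (-q / 10)
    return {b: (1 - p_err) if b == base else p_err / 3 for b in ALPHABET}


def weighted(seq: str, quals: List[int]) -> List[Dist]:
    """Build a weighted sequence from called bases and their quality scores."""
    return [phred_to_dist(b, q) for b, q in zip(seq, quals)]


def expected_score(x: Dist, y: Dist, match: float, mismatch: float) -> float:
    """Expected substitution score of two positions, assumed independent."""
    p_same = sum(x[b] * y[b] for b in ALPHABET)
    return match * p_same + mismatch * (1 - p_same)


def align(xs: List[Dist], ys: List[Dist],
          match: float = 1.0, mismatch: float = -1.0, gap: float = -2.0) -> float:
    """Global alignment score of two weighted sequences (linear gap penalty)."""
    n, m = len(xs), len(ys)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + expected_score(xs[i - 1], ys[j - 1], match, mismatch),
                dp[i - 1][j] + gap,   # gap in ys
                dp[i][j - 1] + gap,   # gap in xs
            )
    return dp[n][m]


if __name__ == "__main__":
    confident = weighted("ACGT", [60, 60, 60, 60])
    doubtful = weighted("ACGT", [3, 3, 3, 3])
    print(align(confident, confident))  # close to the plain match score
    print(align(doubtful, doubtful))    # lower: the calls are unreliable
```

In this sketch, a sequence with quality scores is simply a weighted sequence whose mass at each position is concentrated on the called base, which mirrors the abstract's remark that the quality-score problem is a special case of the weighted one.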

Keywords: sequence alignment; quality scores; DNA sequences; bioinformatics; biological sequences; weighted sequencing; dynamic programming.

DOI: 10.1504/IJBRA.2009.022466

International Journal of Bioinformatics Research and Applications, 2009, Vol. 5, No. 1, pp. 97-113

Published online: 07 Jan 2009
