Title: Pairwise statistical significance and empirical determination of effective gap opening penalties for protein local sequence alignment

Authors: Ankit Agrawal, Volker P. Brendel, Xiaoqiu Huang

Addresses: Department of Computer Science, Iowa State University, 226 Atanasoff Hall, Ames, IA 50011-1041, USA; Department of Genetics, Development and Cell Biology, and Department of Statistics, Iowa State University, 2112 Molecular Biology Building, Ames, IA 50011-3260, USA; Department of Computer Science, Iowa State University, 226 Atanasoff Hall, Ames, IA 50011-1041, USA

Abstract: We evaluate various methods for estimating the pairwise statistical significance of a local sequence alignment in terms of the accuracy of the significance estimates, and compare pairwise statistical significance with popular database search programs in terms of retrieval accuracy on a benchmark database. Results indicate that pairwise statistical significance computed with standard substitution matrices is significantly better than the database statistical significance reported by BLAST and PSI-BLAST, and that it is comparable to, and at times significantly better than, SSEARCH. We also present an application of pairwise statistical significance to the empirical determination of effective gap opening penalties for protein local sequence alignment using the widely used BLOSUM matrices.

Keywords: database statistical significance; gap opening penalty; homologs; pairwise local alignment; pairwise statistical significance; local sequence alignment; protein sequences; homology detection; bioinformatics.

DOI: 10.1504/IJCBDD.2008.022207

International Journal of Computational Biology and Drug Design, 2008, Vol. 1, No. 4, pp. 347-367

Published online: 22 Dec 2008
