Authors: Daniel Honbo; Amit Pande; Alok Choudhary
Addresses: EECS Department, Northwestern University, Evanston, IL 60208, USA; Department of Computer Science, University of California Davis, CA 95616, USA; EECS Department, Northwestern University, Evanston, IL 60208, USA
Abstract: Sequence comparison is one of the most fundamental computational problems in bioinformatics. Pairwise sequence alignment methods align two sequences using a substitution matrix (such as BLOSUM62) that scores the alignment of each pair of residues, and report an alignment score for the given sequence pair. This work addresses the problem of accurately estimating the statistical significance of a pairwise alignment, for the purpose of identifying related sequences, by making the sequence comparison process more sequence-specific. Specifically, we develop sequence-specific strategies for hardware acceleration of pairwise sequence alignment in conjunction with statistical significance estimation. Pairwise statistical significance has been shown to give better retrieval accuracy than the database statistical significance reported by popular database search programmes such as BLAST and PSI-BLAST. We present a 'flexible array' hardware architecture, a scalable systolic array suitable for both long and short sequences. Results on the XtremeData XD1000 FPGA platform show a maximum speed-up of more than a factor of 200.
Keywords: pairwise statistical significance; PSS; field-programmable gate arrays; FPGA; sequence alignment; bioinformatics.
International Journal of High Performance Systems Architecture, 2013 Vol.4 No.3, pp.121 - 131
Available online: 23 Jul 2013
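
The abstract describes pairwise alignment as scoring residue pairs with a substitution matrix and reporting one alignment score per sequence pair. As a rough illustration only, the sketch below computes a Smith-Waterman style local alignment score in software. It is not the authors' hardware design: the choice of Smith-Waterman, the function name `local_alignment_score`, the toy match/mismatch scores (standing in for a real matrix such as BLOSUM62), and the linear gap penalty are all assumptions made for this example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Assumptions for illustration: a toy match/mismatch score replaces a
    real substitution matrix, and gaps use a simple linear penalty.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("ACGT", "ACT"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed at the same time. That data dependency is what makes this kind of recurrence a natural fit for a systolic array like the one the abstract mentions.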