Title: FPGA architecture for pairwise statistical significance estimation

Authors: Daniel Honbo; Amit Pande; Alok Choudhary

Addresses: EECS Department, Northwestern University, Evanston, IL 60208, USA; Department of Computer Science, University of California Davis, CA 95616, USA; EECS Department, Northwestern University, Evanston, IL 60208, USA

Abstract: Sequence comparison is one of the most fundamental computational problems in bioinformatics. Pairwise sequence alignment methods align two sequences using a substitution matrix (such as BLOSUM62), which holds the scores for aligning each pair of residues, and report an alignment score for the given sequence pair. This work addresses the problem of accurately estimating the statistical significance of a pairwise alignment for the purpose of identifying related sequences, by making the sequence comparison process more sequence-specific. Specifically, we develop algorithms for sequence-specific hardware acceleration of pairwise sequence alignment in conjunction with statistical significance estimation. Pairwise statistical significance has been shown to give better retrieval accuracy than the database statistical significance reported by popular database search programmes such as BLAST and PSI-BLAST. We present a 'flexible array' hardware architecture that provides a scalable systolic array suitable for both long and short sequences. Results on the Xtremedata XD1000 FPGA platform show a peak speed-up of more than a factor of 200.
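
As a rough software illustration of the two steps named in the abstract (scoring a pairwise alignment, then asking how significant that score is), the following is a minimal sketch under stated assumptions. It is not the authors' method or hardware design: the match/mismatch scores stand in for a real substitution matrix such as BLOSUM62, the gap penalty is linear, and the significance estimate is a plain shuffle-based empirical p-value. The function names `local_alignment_score` and `empirical_p_value` and all parameter values are hypothetical.

```python
import random


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman recurrence, linear gaps).

    Assumption: a flat match/mismatch score replaces a substitution matrix.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


def empirical_p_value(a, b, shuffles=200, seed=0):
    """Score a against b, then against shuffled copies of b.

    Assumption: a simple empirical p-value, used here only to illustrate
    a sequence-pair-specific significance estimate.
    """
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(letters)
        if local_alignment_score(a, "".join(letters)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (shuffles + 1)


if __name__ == "__main__":
    score, p = empirical_p_value("GATTACAGATTACA", "GATTACATTACAGA")
    print(score, p)
```

Every shuffle repeats the full alignment, so the cost of the significance estimate grows with the number of shuffles, which is one reason this kind of workload is a candidate for acceleration. Within one alignment, the cells on a single anti-diagonal of the table do not depend on each other, which is the kind of parallelism a systolic array can exploit.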

Keywords: pairwise statistical significance; PSS; field-programmable gate arrays; FPGA; sequence alignment; bioinformatics.

DOI: 10.1504/IJHPSA.2013.055222

International Journal of High Performance Systems Architecture, 2013 Vol.4 No.3, pp.121 - 131

Received: 17 Nov 2012
Accepted: 22 Jan 2013

Published online: 25 Jul 2014
