Title: RadixHap: a radix tree-based heuristic for solving the single individual haplotyping problem

Authors: Tai-Chun Wang; Javid Taheri; Albert Y. Zomaya

Addresses: Centre for Distributed and High Performance Computing, School of Information Technologies, University of Sydney, NSW 2006, Australia ' Centre for Distributed and High Performance Computing, School of Information Technologies, University of Sydney, NSW 2006, Australia ' Centre for Distributed and High Performance Computing, School of Information Technologies, University of Sydney, NSW 2006, Australia

Abstract: Single nucleotide polymorphism studies have recently received significant amount of attention from researchers in many life science disciplines. Previous researches indicated that a series of SNPs from the same chromosome, called haplotype, contains more information than individual SNPs. Hence, discovering ways to reconstruct reliable Single Individual Haplotypes becomes one of the core issues in the whole-genome research nowadays. However, obtaining sequence from current high-throughput sequencing technologies always contain inevitable sequencing errors and/or missing information. The SIH reconstruction problem can be formulated as bi-partitioning the input SNP fragment matrix into paternal and maternal sections to achieve minimum error correction; a problem that is proved to be NP-hard. In this study, we introduce a greedy approach, named RadixHap, to handle data sets with high error rates. The experimental results show that RadixHap can generate highly reliable results in most cases. Furthermore, the algorithm structure of RadixHap is particularly suitable for whole-genome scale data sets.

Keywords: single individual haplotypes; minimum error correction; greedy algorithm; radix tree; bioinformatics; single nucleotide polymorphisms; SNPs; gene sequencing; whole genome data sets.

DOI: 10.1504/IJBRA.2015.067336

International Journal of Bioinformatics Research and Applications, 2015 Vol.11 No.1, pp.10 - 29

Received: 17 May 2012
Accepted: 10 May 2013

Published online: 06 Feb 2015 *

Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article