Authors: Ehsan Haghshenas; Nadia Barjaste; Sayyed Rasoul Mousavi
Addresses: Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156, Iran ' Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156, Iran ' Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan 84156, Iran; School of Computer Science, Institute for Research in Fundamental Sciences, Tehran 19538-33511, Iran
Abstract: Studying haplotypes is an important approach to investigating genetic variation in the human genome, because haplotypes carry much of the information about such variation. The haplotype assembly problem is to reconstruct the two haplotypes of an individual from a set of aligned single nucleotide polymorphism (SNP) fragments sequenced from those two haplotypes (for a particular chromosome). Because the fragments may contain sequencing errors, this problem is recognised as NP-hard. In practice, therefore, heuristic algorithms are used to find satisfactory solutions. In this paper, an optimised reimplementation of the HapSAT algorithm is used to find haplotypes for the HuRef dataset. Finding more accurate haplotypes for this dataset is of considerable importance, because the HuRef haplotypes are widely used in research in biology, medicine, and pharmacy. Since the HapSAT algorithm provides significantly superior results compared to previously proposed algorithms, the haplotypes assembled with it should be very useful for future research.
Keywords: haplotype assembly; SNP; single nucleotide polymorphism; minimum error correction; MEC model; heuristics; Max-SAT; HuRef dataset; single individual haplotyping; HapSAT algorithm; genetic variations; haplotypes.
International Journal of Functional Informatics and Personalised Medicine, 2014 Vol.4 No.3/4, pp.274 - 285
Received: 07 May 2013
Accepted: 23 Mar 2014
Published online: 19 Mar 2015