Title: Cross analysis of whole genome deep sequencing data reveals over-presence of nonintronic insertions and deletions (INDELs)
Authors: Yongsheng Bai; Joshua Stolz; Cameron Meyer
Addresses: Department of Biology, Indiana State University, 600 Chestnut Street, Terre Haute, IN 47809, USA (all authors)
Abstract: The advent of high-throughput whole genome/exome sequencing has provided opportunities to study genetic variations and diseases at the individual nucleotide level. A number of published approaches describe methods that align next-generation sequencing reads and identify genetic variants. However, existing variant calling and annotation algorithms are limited in their ability to report true variants for an individual genome because of their insufficient functionality and features. Given the high importance of nonintronic variants, including those that map to exons and promoters, we developed a ranking score function (R-Score) pipeline to prioritise nonintronic variants across multiple samples or individuals by taking into consideration the sequencing coverage, homozygosity, and base calling quality of identified variants. By analysing deep high-throughput sequencing data for 17 members of the Coriell CEPH/UTAH 1463 family pedigree using the R-Score pipeline, we examined the prevalence of scored nonintronic insertions and deletions (INDELs) across multiple individuals.
Keywords: whole genome sequencing; single nucleotide polymorphisms; SNPs; nonintronic insertions; nonintronic deletions; nonintronic variants; variant calling format; family pedigree; computational biology; deep sequencing data.
International Journal of Computational Biology and Drug Design, 2015 Vol.8 No.4, pp.348 - 358
Received: 05 May 2015
Accepted: 30 May 2015
Published online: 15 Dec 2015