Title: An evaluation of allele frequency estimation accuracy using pooled sequencing data

Authors: Yan Guo; Qiuyin Cai; Chun Li; Jiang Li; Chung-I Li; Regina Courtney; Wei Zheng; Jirong Long

Addresses: Department of Cancer Biology, Vanderbilt University, Nashville TN 37232, USA; Epidemiology Centre, Vanderbilt University, Nashville TN 37232, USA; Department of Biostatistics, Vanderbilt University, Nashville TN 37232, USA; Center for Quantitative Sciences, Vanderbilt University, Nashville TN 37232, USA; Department of Applied Mathematics, National Chiayi University, Chiayi, Taiwan; Epidemiology Centre, Vanderbilt University, Nashville TN 37232, USA; Epidemiology Centre, Vanderbilt University, Nashville TN 37232, USA; Epidemiology Centre, Vanderbilt University, Nashville TN 37232, USA

Abstract: Next-generation sequencing (NGS) technology has matured and, given its current affordability, will replace the SNP chip as the genotyping tool of choice. Even so, large-scale studies will require careful design to reduce cost. In this study, we designed an experiment to assess the accuracy of allele frequencies estimated from pooled sequencing data. We compared allele frequencies estimated from pooled sequencing data with those estimated from individual SNP chip data and observed high correlations between them. However, by calculating the error rate, we found that for many SNPs the allele frequency estimated from sequencing data differed significantly from that estimated from SNP chip data. We conclude that correlation is not an ideal measure for comparing allele frequencies and, for the purpose of estimating allele frequency, we do not recommend pooling with NGS as a cheaper alternative to genotyping each sample individually.
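
The contrast between high correlation and a high error rate can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not define the error rate, so the sketch assumes a per-SNP two-proportion z-test at |z| > 1.96, treats sequencing reads as independent allele draws (a simplification for pooled data), and uses made-up counts. The names `pearson`, `two_proportion_z`, `pool` and `chip` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def two_proportion_z(k1, n1, k2, n2):
    """z statistic for the difference between proportions k1/n1 and k2/n2."""
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (k1 / n1 - k2 / n2) / se if se > 0 else 0.0

# Hypothetical toy data, one tuple per SNP (not taken from the paper).
# pool: (alternate-allele reads, total reads) from the pooled sequencing run
# chip: (alternate alleles, total alleles) from individual SNP chip genotypes
pool = [(50, 1000), (260, 1000), (330, 1000), (480, 1000), (100, 1000)]
chip = [(20, 400), (80, 400), (160, 400), (200, 400), (40, 400)]

pool_freq = [k / n for k, n in pool]
chip_freq = [k / n for k, n in chip]

# Global agreement across SNPs
r = pearson(pool_freq, chip_freq)

# Per-SNP agreement: indices of SNPs whose two estimates differ at |z| > 1.96
# (assumed criterion, roughly a two-sided 5% level)
flagged = [
    i for i, ((k1, n1), (k2, n2)) in enumerate(zip(pool, chip))
    if abs(two_proportion_z(k1, n1, k2, n2)) > 1.96
]
error_rate = len(flagged) / len(pool)

print(f"correlation = {r:.3f}")
print(f"flagged SNPs = {flagged}, error rate = {error_rate:.2f}")
```

In this toy data the correlation stays high because it reflects the overall spread of frequencies across SNPs, while the per-SNP test still flags individual SNPs whose two estimates disagree.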

Keywords: next generation sequencing; high throughput sequencing; DNA pooling; Illumina; minor allele frequency; allele frequency estimation; estimation accuracy; error rate.

DOI: 10.1504/IJCBDD.2013.056709

International Journal of Computational Biology and Drug Design, 2013 Vol.6 No.4, pp.279 - 293

Received: 10 Jul 2012
Accepted: 03 Sep 2012

Published online: 27 Aug 2013
