Title: Statistical analysis for aggregated count data in genetic association studies

Authors: Haewon Choi; Hye-Young Jung; Taesung Park

Addresses: Department of Statistics, Seoul National University, Gwanak-gu, Seoul 151-747, South Korea; Department of Statistics, Seoul National University, Gwanak-gu, Seoul 151-747, South Korea; Department of Statistics, Seoul National University, Gwanak-gu, Seoul 151-747, South Korea

Abstract: In smoking behaviour studies, Cigarette Counts Per Day (CPD) are reported in aggregated form, such as 0, one pack, two packs, etc. Analysing such count data is challenging because of reporting bias and the difficulty of specifying an appropriate distribution. In this study, we set out to identify genetic variants, such as Single Nucleotide Polymorphisms (SNPs), that are associated with aggregated count data such as CPD. We first reviewed the existing approaches, in which the aggregated count data are the dependent variable and the SNP is an ordinal independent variable. We then considered a calibration model in which the SNP is the ordinal dependent variable and the aggregated count data are the independent variable. Because the count data enter as a covariate rather than as the response, this calibration modelling approach is robust to distributional assumptions about the count data. We applied the approach to CPD data from 4183 male samples in the Korean Association Resource project. Through simulation studies, we compared the performance of the proposed method with that of competing approaches.

Keywords: self-reported studies; statistical analysis; aggregated count data; genetic association studies; CPD; counts per day; calibration modelling; genetic variants; single nucleotide polymorphisms; SNPs; simulation; bioinformatics.

DOI: 10.1504/IJDMB.2016.079802

International Journal of Data Mining and Bioinformatics, 2016 Vol.16 No.1, pp.77 - 91

Received: 17 May 2016
Accepted: 01 Jun 2016

Published online: 14 Oct 2016
