Title: Risk prediction of type 2 diabetes using common and rare variants

Authors: Sunghwan Bae; Taesung Park

Addresses: Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea; Department of Statistics, Seoul National University, Seoul, South Korea

Abstract: The recent development of next-generation sequencing technology has led to the identification of several disease-related genetic variants. In this study, we systematically compared the performance of prediction models using common and rare variants from the whole exome sequencing (WES) data of the Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples (T2D-GENES) consortium. We evaluated several methods for predicting binary phenotypes, including stepwise logistic regression, penalised regression and the support vector machine (SVM). We first constructed prediction models for type 2 diabetes (T2D) by combining variable selection methods with prediction methods. We then calculated the area under the curve (AUC) to compare the performance of the prediction models. The results indicate that models combining common and rare variants performed better than models using common variants only or rare variants only. Further, the AUC values of SVM were always larger than those of the other prediction models.
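
As an illustration of the comparison criterion named in the abstract, the following is a minimal sketch of how an AUC can be computed for a binary phenotype and used to rank two models. It is not the authors' code: the function name `auc`, the toy labels and the two score vectors are hypothetical, and the AUC is computed with the standard rank-based (Mann-Whitney) definition, in which ties count as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = case, 0 = control).

    Uses the Mann-Whitney formulation: the fraction of case/control pairs
    in which the case receives the higher score, with ties counted as 0.5.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: six samples scored by two illustrative models.
labels = [1, 1, 1, 0, 0, 0]
model_a_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
model_b_scores = [0.6, 0.4, 0.5, 0.5, 0.3, 0.7]

auc_a = auc(labels, model_a_scores)
auc_b = auc(labels, model_b_scores)
better = "model A" if auc_a > auc_b else "model B"
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}, higher AUC: {better}")
```

An AUC of 0.5 corresponds to random ranking of cases and controls, and 1.0 to perfect separation, so the model with the larger AUC is the better discriminator under this criterion.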

Keywords: WES; whole exome sequencing; risk prediction model; T2D; type 2 diabetes; penalised regression methods; stepwise selection; SVM; support vector machine.

DOI: 10.1504/IJDMB.2018.092160

International Journal of Data Mining and Bioinformatics, 2018 Vol.20 No.1, pp.77 - 90

Received: 28 Feb 2018
Accepted: 10 Mar 2018

Published online: 05 Jun 2018
