Title: Prioritisation of candidate Single Amino Acid Polymorphisms using one-class learning machines

Authors: Jiaxin Wu; Mingxin Gan; Rui Jiang

Addresses: MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China; School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China; MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China

Abstract: Recent advances in next-generation sequencing technology have enabled the direct sequencing of rare genetic variants in both case and control individuals. Although a few statistical methods exist for uncovering potential associations between multiple rare variants and human inherited diseases, most of these methods rely on computational approaches to filter out non-functional variants in order to maximise statistical power. To tackle this problem, we formulate the detection of genetic variants associated with a specific type of disease as a one-class novelty learning problem. We focus on a typical class of genetic variant, Single Amino Acid Polymorphisms (SAAPs), and we take advantage of a feature selection mechanism and two one-class learning methods to prioritise candidate SAAPs. Systematic validation demonstrates that the proposed model is effective in recovering disease-associated SAAPs.
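The one-class idea in the abstract can be illustrated with a minimal sketch: a model is fitted only to known disease-associated examples, and candidates are ranked by how well they fit that single class. The sketch below uses a plain Gaussian Parzen-window density estimate as a stand-in for one of the two one-class learners. It is not the authors' implementation; the feature vectors, the names `parzen_score` and `prioritise`, and the bandwidth value are all hypothetical, and the feature selection step is omitted.

```python
import math


def parzen_score(x, train, bandwidth=1.0):
    """Gaussian Parzen-window density of x, estimated from the training vectors."""
    dim = len(x)
    # Normalising constant of an isotropic Gaussian kernel in `dim` dimensions.
    norm = (2 * math.pi * bandwidth ** 2) ** (dim / 2)
    total = 0.0
    for t in train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / (len(train) * norm)


def prioritise(candidates, train, bandwidth=1.0):
    """Return candidate names ordered from highest to lowest density score."""
    scored = [(parzen_score(vec, train, bandwidth), name)
              for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical two-dimensional feature vectors, made up for illustration only.
known_disease = [(0.9, 0.8), (1.0, 0.7), (0.8, 0.9)]
candidates = {
    "cand_A": (0.9, 0.8),
    "cand_B": (0.1, 0.2),
    "cand_C": (0.6, 0.6),
}

ranking = prioritise(candidates, known_disease, bandwidth=0.3)
print(ranking)
```

Candidates that lie close to the known disease-associated examples receive a higher density and appear earlier in the ranking. In practice the bandwidth would need to be tuned, for example by cross-validation on held-out disease-associated variants.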

Keywords: rare variants; SAAPs; single amino acid polymorphisms; one-class SVMs; support vector machines; Parzen probabilistic neural networks; principal component analysis; PCA; sequencing; genetic variants; feature selection.

DOI: 10.1504/IJCBDD.2011.044446

International Journal of Computational Biology and Drug Design, 2011 Vol.4 No.4, pp.316 - 331

Published online: 24 Jan 2015
