Title: Deletion genotype calling on the basis of sequence visualisation and image classification

Authors: Jing Wang; Jingyang Gao; Cheng Ling

Addresses: College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China (all authors)

Abstract: Widely known genotype calling methods, such as CNVnator, Pindel, and LUMPY, are restricted in terms of detectable length range and sequence coverage. Focusing on deletions larger than 50 bp, we propose a new approach with two main steps: (1) visualising deletions as images and (2) classifying deletion genotypes. Given the coordinates of candidate deletions, the method first generates breakpoint images by fetching reads from BAM files. Convolutional neural networks then recognise the genotype from these images. We test our approach on both low- and high-coverage simulated noisy data and compare the results with those of CNVnator, Pindel, and LUMPY. The results indicate that our approach outperforms these tools, with higher accuracy, a wider detectable deletion length range, and better performance on both low- and high-coverage data. In summary, our approach not only provides an intuitive image view of deletion regions but also achieves better genotype calling results than existing tools.
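The first step described in the abstract, turning the reads around a candidate deletion into an image, can be illustrated with a minimal sketch. It is not the authors' encoding: the abstract does not specify the image layout, so the one-read-per-row binary image, the `(start, end)` read tuples, and the names `pileup_image` and `column_depth` are all assumptions made for illustration. In practice the reads would be fetched from a BAM file, and the resulting image would be passed to a convolutional neural network, which is not shown here.

```python
from typing import List, Tuple

# Hypothetical simplified read: (start, end), 0-based, end exclusive.
# A real pipeline would obtain these intervals from a BAM file.
Read = Tuple[int, int]


def pileup_image(reads: List[Read], region_start: int, region_end: int,
                 max_rows: int = 8) -> List[List[int]]:
    """Render reads overlapping [region_start, region_end) as a binary image.

    Each row is one read; a pixel is 1 where the read covers that position.
    The image is padded with empty rows up to max_rows so that every
    candidate yields an image of the same shape (assumed layout).
    """
    width = region_end - region_start
    image: List[List[int]] = []
    for start, end in sorted(reads):
        if end <= region_start or start >= region_end:
            continue  # read lies entirely outside the region
        if len(image) == max_rows:
            break  # keep at most max_rows reads
        row = [0] * width
        for pos in range(max(start, region_start), min(end, region_end)):
            row[pos - region_start] = 1
        image.append(row)
    while len(image) < max_rows:
        image.append([0] * width)
    return image


def column_depth(image: List[List[int]]) -> List[int]:
    """Read depth per position, i.e. the column sums of the image."""
    return [sum(column) for column in zip(*image)]


if __name__ == "__main__":
    # Toy example: no read covers positions 12..17, as one might see
    # across a homozygous deletion.
    toy_reads = [(0, 10), (2, 12), (18, 28), (20, 30)]
    img = pileup_image(toy_reads, 0, 30, max_rows=4)
    for row in img:
        print("".join("#" if pixel else "." for pixel in row))
    print(column_depth(img))
```

In this toy layout a gap with no covering reads suggests a homozygous deletion, whereas a region with roughly halved depth would suggest a heterozygous one; the classifier in the paper is described as learning such distinctions from the images rather than from hand-set thresholds.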

Keywords: deletion; genotype calling; convolutional neural network; visualisation; image classification.

DOI: 10.1504/IJDMB.2018.093682

International Journal of Data Mining and Bioinformatics, 2018 Vol.20 No.2, pp.109 - 122

Received: 21 Apr 2018
Accepted: 03 May 2018

Published online: 31 Jul 2018
