Title: Efficient methods for hierarchical multi-omic feature extraction and visualisation

Authors: Timothy Becker; Dong-Guk Shin

Addresses: Department of Computer Science and Engineering, University of Connecticut, Storrs, Connecticut, USA; Department of Computer Science and Engineering, University of Connecticut, Storrs, Connecticut, USA

Abstract: A single DNA alignment file can be resource-intensive to visualise at arbitrary scale with current visualisation systems. We address this limitation by integrating a parallel out-of-core feature extraction algorithm with a disk-based hierarchical data store that is several orders of magnitude faster for visualisation tasks. To demonstrate the utility of our approach, we designed a high-performance web application that serves translated data to an interactive client. We incorporate novel visualisations of these data features while allowing user-specified resolution and response. Unlike per-read techniques, which can run out of memory when displaying large-scale genomic variations, our data structure returns a controllable representation of the requested region, making the technique ideally suited to visualising multiple large data sets. We describe our open-source feature extraction framework and web-based visualisation, and compare their performance with that of current systems.

Keywords: feature extraction; sequence alignment visualisation.

DOI: 10.1504/IJDMB.2020.108699

International Journal of Data Mining and Bioinformatics, 2020 Vol.23 No.4, pp.285 - 298

Received: 28 Mar 2020
Accepted: 02 Apr 2020

Published online: 27 Jul 2020
