Title: An efficient visualisation method for exploring latent patterns in large microbiome expression data sets

Authors: Weiwei Xu; Timothy Schultz; Rong Xie

Addresses: International School of Software, Wuhan University, Wuhan, Hubei, 430079, China; College of Computing & Informatics, Drexel University, Philadelphia, PA 19104, USA; International School of Software, Wuhan University, Wuhan, Hubei, 430079, China

Abstract: In recent years, the Human Microbiome Project (HMP) has provided analytical insights into the human microbiome and its effects on human health. Using these insights for therapeutic and biotechnological applications requires that researchers efficiently classify the vast number of microbes that make up the microbiome. Because these datasets are sparse and complex, dimensionality reduction algorithms are a popular way of extracting latent phylogenetic themes. We introduce an Augmented Barnes-Hut t-SNE method, which is both more efficient in processing time and sensitive to subtle but meaningful variations in microbial classifications based on five high-level anatomical regions. We demonstrate that our method not only separates the microbiome into these regions, but can further resolve them into 18 separate sample sites. We contend that this approach can accurately resolve phylogenetic themes at varying levels of granularity, and we anticipate its application in other research domains where complex high-dimensional datasets are prevalent.

Keywords: data visualisation; metagenomics; human microbiome project; HMP; Laplacian regularisation; non-negative matrix factorisation; augmented Barnes Hut t-SNE; vantage point tree; dimensionality reduction; approximate similarity search; multidimensional scaling; latent patterns; microbiome expression datasets; bioinformatics.

DOI: 10.1504/IJDMB.2016.076016

International Journal of Data Mining and Bioinformatics, 2016 Vol.15 No.1, pp.47 - 58

Received: 02 Nov 2015
Accepted: 13 Nov 2015

Published online: 21 Apr 2016
