Title: Extracting compact representation of knowledge from gene expression data for protein-protein interaction

Authors: Haohan Wang; Aman Gupta; Ming Xu

Addresses: School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA; Research Institute of Information Technology, Tsinghua University, Beijing, China

Abstract: DNA microarrays measure the expression levels of thousands of genes concurrently. A major challenge is to extract biologically relevant information and knowledge from massive amounts of microarray data. In this paper, we explore learning a compact representation of gene expression profiles using a multi-task neural network model, so that further analyses can be carried out more efficiently on the data. The proposed network is trained with prediction tasks for Protein-Protein Interactions (PPIs) and Gene Ontology (GO) similarities, as well as geometrical constraints, while simultaneously learning a high-level representation of gene expression data. We argue that deep networks can extract more information from expression data than standard statistical models. We tested the utility of our method by comparing its performance with well-known feature extraction and dimensionality reduction methods on the task of PPI prediction, and found the results to be promising.

Keywords: feature extraction; knowledge representation; deep learning; computational biology; convolutional neural network; multi-task network; PPI prediction; gene expression.

DOI: 10.1504/IJDMB.2017.085711

International Journal of Data Mining and Bioinformatics, 2017 Vol.17 No.4, pp.279 - 292

Received: 18 Sep 2016
Accepted: 17 Apr 2017

Published online: 08 Aug 2017
