Title: A feature-learning-based method for the disease-gene prediction problem

Authors: Lorenzo Madeddu; Giovanni Stilo; Paola Velardi

Addresses: Translational and Precision Medicine Department, Sapienza University of Rome, Rome, Italy; Computer Science Department, University of L'Aquila, L'Aquila, Italy; Computer Science Department, Sapienza University of Rome, Rome, Italy

Abstract: We predict disease-gene relations on the human interactome network using a methodology that jointly learns the functional and connectivity patterns surrounding proteins. Unlike other data structures, the interactome is characterised by high incompleteness and the absence of explicit negative knowledge, which makes predictive tasks particularly challenging. To best exploit the latent information in the network, we propose an extended version of random walks, named Random Watcher-Walker (RW²), which is shown to perform better than other state-of-the-art algorithms. We also show that the performance of RW² and of the other state-of-the-art algorithms compared is extremely sensitive to the interactome used and to the adopted disease categorisations, since these influence the ability to capture regularities in the presence of sparsity and incompleteness.

Keywords: network medicine; disease gene prediction; disease gene prioritisation; node embedding; random walks; graph-based methods; biological networks; complex diseases.

DOI: 10.1504/IJDMB.2020.109502

International Journal of Data Mining and Bioinformatics, 2020 Vol.24 No.1, pp.16-37

Received: 04 Apr 2020
Accepted: 04 Apr 2020

Published online: 10 Sep 2020
