Title: Predicting protein functions by using non-negative matrix factorisation with multi-networks co-regularisation

Authors: Wei Peng; Jielin Du; Lun Li; Wei Dai; Wei Lan

Addresses: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan 650500, China ' Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China ' Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China ' Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China; Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan 650500, China ' School of Computer, Electronic and Information, Guangxi University, Nanning, Guangxi 530004, China

Abstract: Designing effective methods that predict protein function by integrating heterogeneous biological data is an active research area. In this work, we propose a novel non-negative matrix factorisation-based method, PONMF-S, which learns protein and GO features from different biological networks for protein function prediction. We also extend PONMF-S to further versions that take into account the functional influence of the neighbours of proteins and of GO terms. We apply our methods and two state-of-the-art methods (UBiRW and NMFGO) to predict functions for proteins of Saccharomyces cerevisiae and Homo sapiens. The results show that PONMF-S outperforms both existing methods when part of the known function information is randomly removed. When predicting functions for proteins with no previously known functional information, PONMF-S significantly improves on the prediction performance of NMFGO and is comparable with UBiRW. Moreover, the extended version of PONMF-S can outperform UBiRW in most function categories.

Keywords: protein function prediction; regularised non-negative matrix factorisation; protein-protein interaction network; GO functional similarity.

DOI: 10.1504/IJDMB.2020.108702

International Journal of Data Mining and Bioinformatics, 2020 Vol.23 No.4, pp.318 - 342

Received: 01 May 2020
Accepted: 04 May 2020

Published online: 27 Jul 2020
