Title: Hyperlink induced topic search-based method to predict essential proteins

Authors: Bihai Zhao; Sai Hu; Zhihong Zhang; Changmei Xu; Xiwei Tang

Addresses: School of Computer Engineering and Applied Mathematics, Changsha University, Changsha 410022, Hunan Sheng, China; Hunan Provincial Key Laboratory of Nutrition and Quality Control of Aquatic Animals, School of Biological and Environmental Engineering, Changsha University, Changsha 410022, Hunan Sheng, China | School of Computer Engineering and Applied Mathematics, Changsha University, Changsha 410022, Hunan Sheng, China | School of Computer Engineering and Applied Mathematics, Changsha University, Changsha 410022, Hunan Sheng, China | School of Computer Engineering and Applied Mathematics, Changsha University, Changsha 410022, Hunan Sheng, China | School of Information Science and Engineering, Hunan First Normal University, Changsha 410205, Hunan Sheng, China

Abstract: Predicting essential proteins helps us understand the minimum requirements for cell survival and development. Benefiting from large-scale protein-protein interaction (PPI) data, many computational methods have been designed to identify essential proteins from PPI networks. Unfortunately, PPI data are incomplete and error-prone owing to the limitations of experimental conditions and techniques. A growing number of researchers therefore predict essential proteins by integrating PPI networks with multiple sources of biological data, yet improving the prediction accuracy of computational methods remains challenging. In this work, a novel essential protein prediction method is proposed based on the Hyperlink Induced Topic Search (HITS) algorithm. First, to reduce the negative impact of false positives on prediction, a weighted network is constructed by integrating the PPI network with gene expression profiles. Then, an improved random walk algorithm based on HITS is applied to the weighted network to iteratively calculate the authority scores and hub scores of proteins. Finally, the top K proteins are selected as essential proteins according to their ranking scores, which are derived from their authority scores and hub scores in the steady state. The experimental results show that the proposed method outperforms other competing essential protein prediction methods.
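
The three steps named in the abstract (weighting the network, iterating authority and hub scores, selecting the top K) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that edge weights are the absolute Pearson correlation of the two proteins' expression profiles, runs the standard HITS update rather than the paper's improved random walk, and assumes the ranking score is the sum of the authority and hub scores. All function names, protein IDs, and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)


def build_weighted_network(edges, expression):
    """Weight each PPI edge by |Pearson| of the endpoints' profiles.

    ASSUMPTION: the weighting scheme is illustrative, not taken from the paper.
    """
    weights = {}
    for u, v in edges:
        w = abs(pearson(expression[u], expression[v]))
        weights.setdefault(u, {})[v] = w
        weights.setdefault(v, {})[u] = w  # PPI edges are undirected
    return weights


def _normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
    return {n: s / norm for n, s in scores.items()}


def weighted_hits(weights, max_iter=100, tol=1e-9):
    """Standard HITS iteration on a weighted network, run until scores settle."""
    nodes = sorted(weights)
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(max_iter):
        # Authority: weighted sum of the neighbours' hub scores.
        new_auth = _normalize(
            {n: sum(w * hub[m] for m, w in weights[n].items()) for n in nodes}
        )
        # Hub: weighted sum of the neighbours' updated authority scores.
        new_hub = _normalize(
            {n: sum(w * new_auth[m] for m, w in weights[n].items()) for n in nodes}
        )
        delta = sum(abs(new_auth[n] - auth[n]) + abs(new_hub[n] - hub[n]) for n in nodes)
        auth, hub = new_auth, new_hub
        if delta < tol:
            break
    return auth, hub


def top_k(auth, hub, k):
    """Rank proteins by authority + hub (ASSUMPTION) and return the top k."""
    score = {n: auth[n] + hub[n] for n in auth}
    return sorted(score, key=lambda n: (-score[n], n))[:k]


# Hypothetical toy data: P1 interacts with three other proteins.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]
expression = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],
    "P3": [1.0, 2.0, 3.0, 5.0],
    "P4": [4.0, 3.0, 2.0, 1.0],
}
network = build_weighted_network(edges, expression)
authority, hubness = weighted_hits(network)
candidates = top_k(authority, hubness, k=2)
```

Because a PPI network is undirected, plain HITS tends to give each protein closely related authority and hub scores; the improved random walk described in the abstract presumably treats the two differently, which this sketch does not attempt to reproduce.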

Keywords: essential proteins; protein-protein interaction; HITS; hyperlink induced topic search; random walk.

DOI: 10.1504/IJDMB.2019.100627

International Journal of Data Mining and Bioinformatics, 2019 Vol.22 No.3, pp.250 - 264

Received: 22 May 2019
Accepted: 23 May 2019

Published online: 05 Jul 2019
