Title: A new algorithm for essential proteins identification based on the integration of protein complex co-expression information and edge clustering coefficient

Authors: Jiawei Luo; Juan Wu

Addresses: School of Information Science and Engineering, Hunan University, Changsha, Hunan, 410082, China (both authors)

Abstract: Essential proteins provide valuable information for biological and medical research at the system level. Methods that rely only on topological centrality are strongly affected by noise in the network, so exploring more efficient methods for identifying essential proteins would be of great value. Using biological features to identify essential proteins is an effective way to reduce noise in the protein-protein interaction (PPI) network. In this paper, based on the consideration that essential proteins evolve slowly and play a central role within a network, a new algorithm named CED is proposed. CED mainly employs gene expression levels, protein complex information and the edge clustering coefficient to predict essential proteins. The performance of CED is validated on yeast PPI networks obtained from the DIP and BioGRID databases. On both datasets, CED achieves higher prediction accuracy than seven other algorithms.
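The abstract names the edge clustering coefficient (ECC) as one ingredient of CED but does not define it. As a minimal sketch, assuming the commonly used definition ECC(u, v) = z(u, v) / min(d(u) - 1, d(v) - 1), where z(u, v) is the number of triangles containing the edge and d is the node degree, the topological part could look like the code below. The function names, the toy edge list and the per-node sum are illustrative assumptions; this is not the CED algorithm itself, and how CED combines ECC with co-expression and protein complex information is not stated here.

```python
from collections import defaultdict


def edge_clustering_coefficients(edges):
    """Return {(u, v): ECC} for an undirected, unweighted edge list.

    Assumes the common definition: triangles through the edge divided by
    min(deg(u) - 1, deg(v) - 1); edges with a zero denominator get 0.0.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbors[u].add(v)
            neighbors[v].add(u)

    ecc = {}
    for u in neighbors:
        for v in neighbors[u]:
            if u < v:  # visit each undirected edge once
                triangles = len(neighbors[u] & neighbors[v])
                denom = min(len(neighbors[u]) - 1, len(neighbors[v]) - 1)
                ecc[(u, v)] = triangles / denom if denom > 0 else 0.0
    return ecc


def ecc_node_scores(edges):
    """Sum ECC over the edges incident to each node (illustrative score only)."""
    scores = defaultdict(float)
    for (u, v), value in edge_clustering_coefficients(edges).items():
        scores[u] += value
        scores[v] += value
    return dict(scores)


# Hypothetical toy network: a triangle A-B-C with a pendant node D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_ecc = edge_clustering_coefficients(toy_edges)
toy_scores = ecc_node_scores(toy_edges)
```

In this toy network the three triangle edges share a common neighbour and so receive a high coefficient, while the pendant edge C-D receives zero because it lies in no triangle.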

Keywords: essential proteins; network topology; protein complex; gene co-expression level; protein identification; edge clustering coefficient; bioinformatics; PPI networks; protein-protein interaction.

DOI: 10.1504/IJDMB.2015.069654

International Journal of Data Mining and Bioinformatics, 2015, Vol. 12, No. 3, pp. 257-274

Received: 16 Jul 2013
Accepted: 22 Mar 2014

Published online: 29 May 2015
