Title: Improving domain-based protein interaction prediction using biologically-significant negative dataset

Authors: Xiao-Li Li, Soon-Heng Tan, See-Kiong Ng

Addresses: Knowledge Discovery Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore (all three authors)

Abstract: We propose a domain-based classification method to predict protein-protein interactions using probabilities of putative interacting domain pairs derived from both experimentally determined interacting protein pairs and carefully chosen non-interacting protein pairs. Multi-species comparative results for protein interaction prediction show that such careful generation of biologically meaningful negative training data can improve classification performance.
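
The abstract does not give the scoring formula, so the following is only a minimal sketch of one way domain-pair probabilities could be derived from positive and negative protein pairs and then combined into a protein-level score. The function names, the pseudocount smoothing, and the noisy-OR combination are all assumptions made for illustration, not the authors' published method.

```python
from collections import Counter
from itertools import product


def domain_pairs(domains_a, domains_b):
    """Return the set of unordered domain pairs between two proteins."""
    return {tuple(sorted(p)) for p in product(domains_a, domains_b)}


def estimate_pair_probabilities(positives, negatives, domains, pseudocount=1.0):
    """Estimate, for each domain pair, the smoothed fraction of protein pairs
    containing it that are labelled as interacting.

    positives / negatives: lists of (protein, protein) tuples.
    domains: dict mapping a protein name to its list of domain names.
    The pseudocount smoothing is an illustrative assumption.
    """
    pos, neg = Counter(), Counter()
    for a, b in positives:
        pos.update(domain_pairs(domains[a], domains[b]))
    for a, b in negatives:
        neg.update(domain_pairs(domains[a], domains[b]))
    return {
        pair: (pos[pair] + pseudocount) / (pos[pair] + neg[pair] + 2 * pseudocount)
        for pair in set(pos) | set(neg)
    }


def interaction_score(a, b, domains, probs):
    """Combine domain-pair probabilities with a noisy-OR (an assumed choice).

    Domain pairs never seen in training contribute nothing to the score.
    """
    no_interaction = 1.0
    for pair in domain_pairs(domains[a], domains[b]):
        no_interaction *= 1.0 - probs.get(pair, 0.0)
    return 1.0 - no_interaction
```

A protein pair would then be classified as interacting when its score exceeds a threshold chosen on held-out data. Under this sketch, the choice of negative pairs directly shapes the denominators of the estimates, which is where a biologically meaningful negative set, as opposed to a randomly selected one, would have its effect.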

Keywords: protein-protein interactions; protein interaction prediction; domain-domain interaction; biologically significant negative sets; randomly selected negative sets; classification; F-measure; bioinformatics; training data.

DOI: 10.1504/IJDMB.2006.010852

International Journal of Data Mining and Bioinformatics, 2006 Vol.1 No.2, pp.138 - 149

Published online: 07 Sep 2006