Title: Integration of k-means clustering algorithm with network analysis for drug-target interactions network prediction

Authors: Sara Aghakhani; Ala Qabaja; Reda Alhajj

Addresses: Department of Computer Science, University of Calgary, Calgary, Alberta, Canada ' Department of Computer Science, University of Calgary, Calgary, Alberta, Canada ' Department of Computer Science, University of Calgary, Calgary, Alberta, Canada

Abstract: Prediction of the interactions between drugs and target proteins is an important factor in in silico drug discovery. The number of known interactions is very small compared with the number of potential interactions. In this paper, we propose a new method that combines chemical structure data with genomic sequence data. The method uses both supervised and unsupervised learning, as well as network analysis techniques: it integrates the k-means clustering algorithm with Social Network Analysis (SNA) techniques to predict drug-target interactions. We demonstrate its performance on four classes of drug-target interaction networks in humans: enzymes, ion channels, G protein-coupled receptors (GPCRs), and nuclear receptors. The area under the curve (AUC) is used to evaluate the accuracy of the proposed approach with three classifiers: Bayes Network, Naïve Bayes and SVM. We identified novel drug-protein interactions using the Bayes Network classifier. The reported accuracies for enzymes, ion channels, GPCRs, and nuclear receptors are 98%, 85%, 98.6% and 99.2%, respectively.
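
The abstract names k-means clustering and AUC-based evaluation without giving details. The following is a minimal, generic sketch of those two building blocks in Python, not the authors' pipeline. The function names `kmeans` and `auc`, the fixed starting centroids, and the toy inputs are illustrative assumptions, and AUC is assumed here to mean the area under the ROC curve.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means from given starting centroids.

    Returns (labels, centroids). Starting centroids are passed in explicitly
    (an assumption of this sketch) so the result is deterministic.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(old)  # leave an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return labels, centroids


def auc(scores, truths):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs that the scores rank correctly.
    Tied pairs count as half."""
    pos = [s for s, t in zip(scores, truths) if t == 1]
    neg = [s for s, t in zip(scores, truths) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

As a usage example under the same assumptions, `kmeans([(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)], [(0.0, 0.0), (5.0, 5.0)])` groups the first two and last two points into separate clusters, and `auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` scores a hypothetical classifier whose predicted interaction scores rank three of the four positive-negative pairs correctly.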

Keywords: k-means; clustering; network analysis; drug-protein interactions; network prediction; classification; support vector machine.

DOI: 10.1504/IJDMB.2018.094776

International Journal of Data Mining and Bioinformatics, 2018 Vol.20 No.3, pp.185 - 212

Accepted: 07 Apr 2018
Published online: 15 Sep 2018
