Authors: Madhulata Kumari; Subhash Chandra
Addresses: Department of Information Technology, Kumaun University, SSJ Campus, Almora, Uttarakhand 263601, India; Department of Botany, Kumaun University, SSJ Campus, Almora, Uttarakhand 263601, India
Abstract: Machine learning techniques have been widely used in drug discovery and development, particularly in cheminformatics. Aspartyl aminopeptidase (M18AAP) of Plasmodium falciparum is crucial for the survival of the malaria parasite. We created predictive models using Weka and evaluated their performance on various statistical parameters. The Random Forest-based model was found to have the highest specificity (97.94%), along with the best accuracy (97.3%), MCC (0.306) and ROC (86.1%). The accuracy and MCC of these models indicated that they could be used to classify large datasets of unknown compounds and to predict antimalarial compounds for the development of effective drugs. We then deployed the best predictive model on the NCI diversity set IV. As a result, we found 59 bioactive anti-malarial molecules inhibiting M18AAP, of which 18 were non-toxic hit molecules. We suggest that such machine learning approaches could be applied to reduce the cost and duration of drug discovery.
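The statistical parameters named in the abstract (accuracy, specificity, MCC) can all be derived from a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, not the authors' Weka workflow; the function name `confusion_metrics` and the counts passed to it are hypothetical and chosen only for illustration.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Return accuracy, specificity, sensitivity and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Specificity: fraction of true negatives (inactive compounds) correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Sensitivity: fraction of true positives (active compounds) correctly found.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal sum is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "mcc": mcc,
    }


# Hypothetical counts for an imbalanced screen (few actives, many inactives).
metrics = confusion_metrics(tp=5, tn=90, fp=2, fn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On imbalanced data, accuracy and specificity can be high while MCC stays modest, because MCC also penalises missed actives. This is one reason to report MCC alongside accuracy.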
Keywords: machine learning; data mining; weka; random forest; naive Bayes; J48; toxicity prediction; malaria; drug discovery; in silico prediction; anti-malarial hit molecules; malaria drugs; predictive modelling.
International Journal of Computational Biology and Drug Design, 2015 Vol.8 No.1, pp.40 - 53
Received: 20 Oct 2014
Accepted: 18 Nov 2014
Published online: 07 Apr 2015