Evaluation of predictive models based on random forest, decision tree and support vector machine classifiers and virtual screening of anti-mycobacterial compounds
by Madhulata Kumari; Neeraj Tiwari; Naidu Subbarao; Subhash Chandra
International Journal of Computational Biology and Drug Design (IJCBDD), Vol. 10, No. 3, 2017

Abstract: Three machine learning classifiers (random forest, decision tree and support vector machine) were used to build predictive models from an anti-mycobacterial ChEMBL database and were evaluated for their predictive capability. Before the predictive models were developed, the data were pre-processed to address the class imbalance problem by applying a cost-sensitive classifier and by filtering data instances with the supervised synthetic minority oversampling technique (SMOTE), spread subsample and resample methods. The statistical evaluation indicated that the random forest model was the best model overall, with an accuracy of 93.83%, specificity of 90.5%, receiver operating characteristic (ROC) area of 0.984, Matthews correlation coefficient (MCC) of 0.772 and kappa statistic of 0.768, whereas LibSVM showed the highest sensitivity (94.4%). Additionally, toxicity predictive models based on the SingleCellcall DSSTox carcinogenicity database (AID1189) were developed, and the random forest model was again the best. Deploying both RF predictive models on two unknown datasets identified 1317 of 1554 approved drugs and 2234 of 18,746 compounds in a ChEMBL anti-malarial dataset as non-toxic and anti-mycobacterial. Machine learning models are thus an efficient way to find novel anti-mycobacterial hit compounds. We suggest that such machine learning techniques could be very useful for screening drug candidates not only for tuberculosis but also for other diseases.
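
The evaluation statistics named in the abstract (accuracy, sensitivity, specificity, MCC and kappa) can all be derived from a binary confusion matrix. The following is a minimal Python sketch of the standard definitions, not the authors' code; the function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Standard binary-classification statistics from confusion-matrix counts.

    tp/fn: actives predicted active/inactive; fp/tn: inactives predicted
    active/inactive. Illustrative helper only, not the authors' implementation.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC and kappa both correct for agreement expected by chance, which is why they are commonly reported alongside accuracy when the classes are imbalanced, as they are here.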

Online publication date: Tue, 25-Jul-2017
