Title: Machine learning methods for predicting the biological activities of molecules in high diverse databases

Authors: Faisal Saeed

Addresses: College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia

Abstract: In-silico drug discovery methods rely on the similar property principle, which states that structurally similar compounds exhibit similar biological activities. New drugs have therefore been discovered using biological activity prediction methods that depend on the structures of chemical compounds. Several computational methods have been used for this purpose. However, previous studies showed that predicting biological activities for heterogeneous molecules remains a challenge. This paper used several machine learning methods and different combinations of ensemble learning methods to enhance the performance of predicting molecular activities. In this study, a heterogeneous subset of the MDL Drug Data Report (MDDR) dataset was used. The performance of the methods is reported and discussed in order to recommend the best machine learning and ensemble methods for this kind of diverse chemical dataset.

Keywords: biological activities; chemical compounds; chemical informatics; ensemble methods; machine learning methods.

DOI: 10.1504/IJICT.2022.124833

International Journal of Information and Communication Technology, 2022 Vol.21 No.2, pp.170 - 180

Received: 17 Oct 2020
Accepted: 05 Nov 2020

Published online: 09 Aug 2022
