Title: Multi-descriptor approaches to oxygen binding proteins prediction and classification using deep learning
Authors: Soumiya Hamena; Souham Meshoul; Salima Ouadfel
Addresses: Laboratory of Modeling and Implementation of Complex Systems (MISC Laboratory), Computer Science Department, College of NTIC, Constantine 2 University, Abdelhamid Mehri, Biotechnology Research Center (CRBt) & CERIST, 25000 Constantine, Algeria; IT Department, College of Computer and Information Sciences, PNU, 84428 Riyadh, Saudi Arabia; Computer Science Department, Constantine 2 University, Abdelhamid Mehri, 25000 Constantine, Algeria
Abstract: Oxygen binding proteins play a key role in transporting and storing oxygen throughout the body's cells. However, costly and time-consuming biological tests can characterize only a very small portion of all available proteins, which has made computational approaches increasingly essential to biologists. The key idea behind this work is to investigate the effect of combining several descriptors with deep learning to achieve better prediction. Three kinds of descriptors are considered to represent protein sequences: amino acid composition (AAC), dipeptide composition (DC) and conjoint triad feature (CTF). First, we applied deep neural networks (DNN) to each single descriptor and then to combinations of descriptors (multi-descriptor). Second, a DNN was developed to classify oxygen binding proteins into six different classes. The experimental results show that the proposed prediction models outperform existing methods; we obtained a Matthews correlation coefficient (MCC) of 0.9677.
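Illustration: the three descriptors and the MCC metric named in the abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not give the normalization, the amino acid grouping used for CTF, or the network architecture, so the code assumes plain relative frequencies, the seven-class grouping commonly used for conjoint triads, and simple concatenation as the multi-descriptor. All function and constant names are hypothetical.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs

# Assumption: seven-class residue grouping commonly used for conjoint triads.
CTF_GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
GROUP_OF = {aa: str(i) for i, group in enumerate(CTF_GROUPS) for aa in group}
TRIADS = ["".join(p) for p in product("0123456", repeat=3)]  # 343 triads


def _clean(seq):
    """Upper-case the sequence and drop non-standard residue symbols."""
    return "".join(c for c in seq.upper() if c in AMINO_ACIDS)


def _kmer_freqs(s, k, vocab):
    """Relative frequency of every overlapping k-mer of s, ordered by vocab."""
    total = len(s) - k + 1
    if total <= 0:
        return [0.0] * len(vocab)
    counts = dict.fromkeys(vocab, 0)
    for i in range(total):
        counts[s[i:i + k]] += 1
    return [counts[v] / total for v in vocab]


def aac(seq):
    """Amino acid composition: 20 values."""
    return _kmer_freqs(_clean(seq), 1, list(AMINO_ACIDS))


def dc(seq):
    """Dipeptide composition: 400 values."""
    return _kmer_freqs(_clean(seq), 2, DIPEPTIDES)


def ctf(seq):
    """Conjoint triad feature: 343 values over group-encoded residues."""
    encoded = "".join(GROUP_OF[c] for c in _clean(seq))
    return _kmer_freqs(encoded, 3, TRIADS)


def multi_descriptor(seq):
    """Assumed combination scheme: concatenate AAC, DC and CTF (763 values)."""
    return aac(seq) + dc(seq) + ctf(seq)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

A vector returned by multi_descriptor could then serve as the input to a classifier such as a DNN; the choice of network and training setup is left open here because the abstract does not specify it.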
Keywords: oxygen binding proteins; proteins function; AAC; DC; CTF; deep neural networks; deep learning; data mining.
International Journal of Bioinformatics Research and Applications, 2022 Vol.18 No.3, pp.191 - 218
Received: 19 Dec 2019
Accepted: 15 Jan 2021
Published online: 22 Aug 2022