Title: Predicting the secondary structure of proteins using machine learning algorithms

Authors: Rui Camacho; Rita Ferreira; Natacha Rosa; Vânia Guimarães; Nuno A. Fonseca; Vítor Santos Costa; Miguel De Sousa; Alexandre Magalhães

Addresses: LIAAD&DEI-Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' CRACS-INESC Porto L.A./FCUP, R. Campo Alegre 1021/1055, 4169-007 Porto, Portugal ' CRACS-INESC Porto L.A./FCUP, R. Campo Alegre 1021/1055, 4169-007 Porto, Portugal ' REQUIMTE/Universidade do Porto, R. Campo Alegre 687, 4169-007 Porto, Portugal ' REQUIMTE/Universidade do Porto, R. Campo Alegre 687, 4169-007 Porto, Portugal

Abstract: The functions of proteins in living organisms are related to their 3D structure, which is known to be ultimately determined by the linear sequence of amino acids that form these macromolecules. It is therefore of great importance to understand and predict how the 3D structure of a protein arises from a particular linear sequence of amino acids. In this paper we report the application of machine learning methods to predict, with high accuracy, the secondary structure of proteins, namely α-helices and β-sheets, which are intermediate levels of the local structure.

Keywords: data mining; machine learning; classification; decision trees; rule induction; instance based learning; Bayesian algorithms; WEKA; bioinformatics; protein folding; secondary structure; protein structure; structure conformations; amino acids; structure prediction.

DOI: 10.1504/IJDMB.2012.050265

International Journal of Data Mining and Bioinformatics, 2012 Vol.6 No.6, pp.571 - 584

Received: 08 May 2021
Accepted: 12 May 2021

Published online: 13 Nov 2012
