Title: Predicting the secondary structure of proteins using machine learning algorithms
Authors: Rui Camacho; Rita Ferreira; Natacha Rosa; Vânia Guimarães; Nuno A. Fonseca; Vítor Santos Costa; Miguel De Sousa; Alexandre Magalhães
Addresses: LIAAD&DEI-Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' Faculdade de Engenharia da Universidade do Porto, Rua Dr Roberto Frias s/n, 4420-465 Porto, Portugal ' CRACS-INESC Porto L.A./FCUP, R. Campo Alegre 1021/1055, 4169-007 Porto, Portugal ' CRACS-INESC Porto L.A./FCUP, R. Campo Alegre 1021/1055, 4169-007 Porto, Portugal ' REQUIMTE/Universidade do Porto, R. Campo Alegre 687, 4169-007 Porto, Portugal ' REQUIMTE/Universidade do Porto, R. Campo Alegre 687, 4169-007 Porto, Portugal
Abstract: The functions of proteins in living organisms are related to their 3D structure, which is known to be ultimately determined by the linear sequence of amino acids that form these macromolecules. It is therefore of great importance to understand and predict how a protein's 3D structure arises from a particular linear sequence of amino acids. In this paper we report the application of machine learning methods to predict, with high accuracy, the secondary structure of proteins, namely α-helices and β-sheets, which are intermediate levels of the local structure.
Keywords: data mining; machine learning; classification; decision trees; rule induction; instance based learning; Bayesian algorithms; WEKA; bioinformatics; protein folding; secondary structure; protein structure; structure conformations; amino acids; structure prediction.
DOI: 10.1504/IJDMB.2012.050265
International Journal of Data Mining and Bioinformatics, 2012 Vol.6 No.6, pp.571 - 584
Published online: 17 Dec 2014