Title: Comparing ensemble learning methods based on decision tree classifiers for protein fold recognition

Authors: Mahshid Khatibi Bardsiri; Mahdi Eftekhari

Addresses: Department of Computer and Electronic, Kerman Graduate University of Technology, Kerman, Iran; Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran

Abstract: This paper compares several decision tree (DT) based ensemble learning methods for protein fold recognition on three datasets taken from the literature. Following previously reported studies, the features of each dataset are divided into groups. For each group, three ensemble classifiers are trained: random forest, rotation forest and AdaBoost.M1. Fusion methods are then introduced to combine the ensemble classifiers obtained for the individual groups. This step yields three combined classifiers, one each of type random forest, rotation forest and AdaBoost.M1. Finally, these three classifiers are combined into an overall classifier. Experimental results show that the overall classifier obtained with the genetic algorithm (GA) weighting fusion method achieves higher classification accuracy than previously applied methods.
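
The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the GA operators or the fitness function, so truncation selection, uniform crossover, Gaussian mutation and accuracy-as-fitness are all assumptions, as are the function names and the toy class probabilities. Each base classifier is assumed to output one class-probability vector per sample, and the GA searches for one weight per classifier.

```python
import random


def fuse(prob_sets, weights):
    """Weighted-sum fusion. prob_sets[k][i] is classifier k's
    class-probability list for sample i; returns predicted class indices."""
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    preds = []
    for i in range(n_samples):
        scores = [
            sum(w * probs[i][c] for w, probs in zip(weights, prob_sets))
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=scores.__getitem__))
    return preds


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def ga_weights(prob_sets, labels, pop_size=20, generations=30,
               mutation=0.1, seed=0):
    """Search for fusion weights in [0, 1] with a small GA (assumed operators)."""
    rng = random.Random(seed)
    k = len(prob_sets)

    def fitness(w):
        return accuracy(fuse(prob_sets, w), labels)

    population = [[rng.random() for _ in range(k)] for _ in range(pop_size)]
    population[0] = [1.0] * k  # seed with equal weights as a baseline
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection; elite kept
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            # Gaussian mutation, clamped to [0, 1]
            child = [min(1.0, max(0.0, g + rng.gauss(0, mutation))) for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers on six samples, three classes.
    labels = [0, 1, 2, 0, 1, 2]
    probs = [
        [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1], [0.2, 0.2, 0.6],
         [0.6, 0.3, 0.1], [0.4, 0.5, 0.1], [0.3, 0.3, 0.4]],
        [[0.4, 0.5, 0.1], [0.1, 0.8, 0.1], [0.1, 0.5, 0.4],
         [0.5, 0.4, 0.1], [0.2, 0.7, 0.1], [0.2, 0.2, 0.6]],
        [[0.3, 0.3, 0.4], [0.3, 0.3, 0.4], [0.1, 0.1, 0.8],
         [0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.2, 0.3, 0.5]],
    ]
    weights = ga_weights(probs, labels)
    print("weights:", weights)
    print("fused accuracy:", accuracy(fuse(probs, weights), labels))
```

In practice the weights would be fitted on held-out validation predictions rather than on the data used for the final accuracy estimate. Because the equal-weight vector is seeded into the population and the best half survives each generation, the returned weights score at least as well as unweighted fusion on the data used for the search.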

Keywords: protein fold recognition; decision trees; ensemble learning; ensemble classifiers; random forest; rotation forest; AdaBoost.M1; genetic algorithms; GA weighting fusion; classification accuracy; bioinformatics.

DOI: 10.1504/IJDMB.2014.057776

International Journal of Data Mining and Bioinformatics, 2014 Vol.9 No.1, pp.89 - 105

Received: 20 Jul 2011
Accepted: 01 Feb 2012

Published online: 21 Oct 2014
