Title: An evolutionary algorithm for global induction of regression and model trees

Authors: Marcin Czajkowski; Marek Kretowski

Addresses: Faculty of Computer Science, Bialystok University of Technology, Wiejska 45a, 15-351 Bialystok, Poland; Faculty of Computer Science, Bialystok University of Technology, Wiejska 45a, 15-351 Bialystok, Poland

Abstract: Most tree-based algorithms are typical top-down approaches that search only for locally optimal decisions at each node and do not guarantee a globally optimal solution. In this paper, we propose a new evolutionary algorithm for global induction of univariate regression trees and of model trees that associate leaves with simple linear regression models. The general structure of our solution follows a typical framework of evolutionary algorithms with an unstructured population and generational selection. We propose specialised genetic operators to mutate and cross over individuals (trees), a fitness function based on the Bayesian information criterion, and a smoothing process that improves the prediction accuracy of the model tree. Experiments on 15 real-life datasets show that the proposed solution can be significantly less complex than its classical top-down counterparts while achieving at least comparable performance.
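To illustrate the kind of BIC-based fitness the abstract mentions, here is a minimal Python sketch. It is not the authors' implementation: it uses the textbook form BIC = -2 ln L + k ln n with a Gaussian error model, and both the function name `gaussian_bic` and the way the parameter count `num_params` is obtained from a tree are assumptions made for illustration only.

```python
import math


def gaussian_bic(residuals, num_params):
    """Textbook BIC for a model with Gaussian errors (lower is better).

    residuals  -- prediction errors of the tree on the training set
    num_params -- hypothetical complexity measure of the tree, e.g. the
                  number of split tests plus the number of leaf-model
                  coefficients (an assumption, not taken from the paper)
    """
    n = len(residuals)
    sse = sum(r * r for r in residuals)
    # Maximum-likelihood variance estimate; the floor avoids log(0)
    # when the tree fits the training data perfectly.
    sigma2 = max(sse / n, 1e-12)
    log_likelihood = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    return -2.0 * log_likelihood + num_params * math.log(n)


# Two hypothetical trees with the same training error: the one with
# fewer parameters receives the lower (better) score.
errors = [1.0, -1.0, 1.0, -1.0]
small_tree = gaussian_bic(errors, num_params=2)
large_tree = gaussian_bic(errors, num_params=6)
print(small_tree < large_tree)
```

Under this formulation, an evolutionary search that minimises the score trades training error against tree size, which is consistent with the abstract's claim that the resulting trees can be less complex at comparable accuracy.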

Keywords: evolutionary algorithms; regression trees; model trees; SLR; linear regression; Bayesian information criterion; BIC; regression modelling.

DOI: 10.1504/IJDMMM.2013.055865

International Journal of Data Mining, Modelling and Management, 2013 Vol.5 No.3, pp.261 - 276

Published online: 29 Jul 2014
