Authors: Marcin Czajkowski; Marek Kretowski
Addresses: Faculty of Computer Science, Bialystok University of Technology, Wiejska 45a, 15-351 Bialystok, Poland; Faculty of Computer Science, Bialystok University of Technology, Wiejska 45a, 15-351 Bialystok, Poland
Abstract: Most tree-based algorithms are typical top-down approaches that search only for locally optimal decisions at each node and do not guarantee a globally optimal solution. In this paper, we propose a new evolutionary algorithm for the global induction of univariate regression trees and of model trees that associate leaves with simple linear regression models. The general structure of our solution follows a typical framework of evolutionary algorithms with an unstructured population and generational selection. We propose specialised genetic operators to mutate and cross over individuals (trees), a fitness function based on the Bayesian information criterion, and a smoothing process that improves the prediction accuracy of the model tree. Experiments on 15 real-life datasets show that the proposed solution can be significantly less complex than its classical top-down counterparts while achieving at least comparable performance.
Keywords: evolutionary algorithms; regression trees; model trees; SLR; linear regression; Bayesian information criterion; BIC; regression modelling.
International Journal of Data Mining, Modelling and Management, 2013 Vol.5 No.3, pp.261 - 276
Received: 08 May 2021
Accepted: 12 May 2021
Published online: 13 Aug 2013