Title: Decision trees for binary classification variables grow equally with the Gini impurity measure and Pearson's chi-square test

Authors: Johannes L. Grabmeier, Larry A. Lambe

Addresses: University of Applied Sciences Deggendorf, Edlmairstr. 6+8, D-94469, Deggendorf, Germany; Multidisciplinary Software Systems Research Corporation (MSSRC), P.O. Box 6667, Bloomingdale, IL 60108, USA

Abstract: We show that for binary classification variables, Gini and Pearson purity measures yield exactly the same tree, provided all the other parameters of the algorithms are identical. A counter-example for ternary classification variables is given.
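The claim for the binary case can be checked numerically on a small example. The sketch below is an illustration only, not the authors' proof: the function names, the candidate splits and the class counts are all hypothetical. It assumes the Gini measure is the usual impurity decrease with impurity 2p(1-p), and that the Pearson measure is the chi-square statistic of the branch-by-class contingency table. Under those assumptions, for a node with n records and a binary class variable, the chi-square statistic equals n times the Gini gain divided by the parent impurity, so both measures rank the candidate splits of a node in the same order.

```python
def gini(pos, neg):
    """Gini impurity 2p(1-p) of a node with binary class counts."""
    n = pos + neg
    p = pos / n
    return 2.0 * p * (1.0 - p)


def gini_gain(branches):
    """Impurity decrease of a split given as [(pos, neg), ...], one pair per branch."""
    pos = sum(b[0] for b in branches)
    neg = sum(b[1] for b in branches)
    n = pos + neg
    children = sum((b[0] + b[1]) / n * gini(b[0], b[1]) for b in branches)
    return gini(pos, neg) - children


def chi_square(branches):
    """Pearson chi-square statistic of the branch-by-class contingency table."""
    pos = sum(b[0] for b in branches)
    neg = sum(b[1] for b in branches)
    n = pos + neg
    stat = 0.0
    for b in branches:
        size = b[0] + b[1]
        for observed, class_total in ((b[0], pos), (b[1], neg)):
            expected = size * class_total / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical candidate splits of one node holding 40 positive and
# 60 negative records (made-up counts, no empty branches).
candidates = {
    "A": [(30, 10), (10, 50)],
    "B": [(25, 25), (15, 35)],
    "C": [(10, 5), (20, 25), (10, 30)],
}


def rank_by(measure):
    """Names of the candidate splits, best first, under the given measure."""
    return sorted(candidates, key=lambda name: measure(candidates[name]), reverse=True)


print(rank_by(gini_gain))
print(rank_by(chi_square))
```

Because n and the parent impurity are fixed within a node, the best split is the same under either measure, which is consistent with the abstract's statement that the resulting trees coincide when all other parameters are identical. The sketch does not model stopping rules, tie-breaking, or the ternary case, for which the abstract reports a counter-example.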

Keywords: decision trees; Gini; impurity measure; Pearson; chi-square test; entropy; binary classification; variables; contingency matrix; power series expansion; symmetric polynomials; purity measures; ternary classification; data mining.

DOI: 10.1504/IJBIDM.2007.013938

International Journal of Business Intelligence and Data Mining, 2007 Vol.2 No.2, pp.213 - 226

Published online: 04 Jun 2007
