Title: When to choose an ensemble classifier model for data mining

Authors: Mordechai Gal-Or, Jerrold H. May, William E. Spangler

Addresses: Palumbo-Donahue School of Business, Duquesne University, Pittsburgh, 15282, PA, USA; Joseph M. Katz Graduate School of Business, University of Pittsburgh, Pittsburgh, 15260, PA, USA; Palumbo-Donahue School of Business, Duquesne University, Pittsburgh, 15282, PA, USA

Abstract: This study empirically explores the use of a group, or ensemble, of classifiers to support managerial decision making in domains characterised by asymmetric misclassification costs. The approach developed in this study is intended to assist a decision maker in determining whether a current situation warrants the choice of an ensemble over an individual classifier. The decision is based primarily on misclassification costs in the decision context and the associated basis on which performance is assessed. We show that the criteria for evaluating classifier performance are fundamentally dependent on the symmetry or asymmetry of misclassification costs. The result of this study is a set of heuristics for identifying highly- and poorly-performing ensembles.

Keywords: data mining; classification costs; multiple classifiers; classifier ensembles; classifier groups; decision making; misclassification costs.

DOI: 10.1504/IJBIDM.2010.033364

International Journal of Business Intelligence and Data Mining, 2010 Vol.5 No.3, pp.297 - 318

Received: 24 Sep 2008
Accepted: 26 Mar 2009

Published online: 01 Jun 2010
