Title: Making an accurate classifier ensemble by voting on classifications from imputed learning sets
Authors: Xiaoyuan Su, Taghi M. Khoshgoftaar, Russell Greiner
Addresses: Computer Science and Engineering, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA; Computer Science and Engineering, Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA; Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
Abstract: Ensemble methods often produce effective classifiers by learning a set of base classifiers from a diverse collection of training sets. In this paper, we present a system, voting on classifications from imputed learning sets (VCI), that produces those diverse training sets by randomly removing a small percentage of attribute values from the original training set and then using an imputation technique to replace those values. VCI then runs a learning algorithm on each of these imputed training sets to produce a set of base classifiers. The final prediction on a novel instance is the plurality classification produced by these classifiers. We investigate various imputation techniques here, including the state-of-the-art Bayesian multiple imputation (BMI) and expectation maximisation (EM). Our empirical results show that VCI predictors, particularly those using BMI and EM as imputers, significantly improve classification accuracy over conventional classifiers, especially on datasets that are originally incomplete; moreover, VCI significantly outperforms bagging predictors and imputation-helped machine learners.
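The VCI procedure described in the abstract (mask a few attribute values, impute them, train one base classifier per imputed set, then take the plurality vote) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: column-mean imputation stands in for the BMI and EM imputers studied in the paper, a nearest-centroid rule stands in for the base learning algorithm, and all function names, the 5% masking rate, and the ensemble size are assumptions chosen for the example.

```python
import random
from collections import Counter


def mask_values(rows, fraction, rng):
    """Copy rows, setting each cell to None with probability `fraction`.

    Cells that are already None (originally missing) stay None.
    """
    return [[None if rng.random() < fraction else v for v in row] for row in rows]


def mean_impute(rows):
    """Stand-in imputer (assumption): fill None with the column mean of observed values."""
    means = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def fit_centroids(rows, labels):
    """Toy base learner (assumption): one mean vector per class."""
    grouped = {}
    for r, y in zip(rows, labels):
        grouped.setdefault(y, []).append(r)
    return {y: [sum(col) / len(col) for col in zip(*rs)] for y, rs in grouped.items()}


def predict_centroid(centroids, x):
    """Return the class whose centroid is closest to x in squared Euclidean distance."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)))


def vci_fit(rows, labels, n_models=5, fraction=0.05, seed=0):
    """Train one base classifier per randomly masked and re-imputed training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        imputed = mean_impute(mask_values(rows, fraction, rng))
        models.append(fit_centroids(imputed, labels))
    return models


def vci_predict(models, x):
    """Plurality vote over the base classifiers' predictions for x."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
    y = ["a", "a", "a", "b", "b", "b"]
    ensemble = vci_fit(X, y, n_models=7)
    print(vci_predict(ensemble, [0.1, 0.1]), vci_predict(ensemble, [5.0, 5.0]))
```

Diversity among the base classifiers comes only from the random masking step, so a stronger imputer or base learner can be swapped into `mean_impute` or `fit_centroids` without changing the voting logic.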
Keywords: machine learned classifiers; imputation techniques; incomplete data; ensemble classifiers; classification accuracy; training sets.
International Journal of Information and Decision Sciences, 2009 Vol. 1 No. 3, pp. 301-322
Published online: 06 Aug 2009