Title: The effect of class imbalance, complexity, size, and learning distribution on classifier performance
Authors: Sofia Visa
Addresses: Department of Mathematics and Computer Science, College of Wooster, 1189 Beall Ave., Wooster, OH 44691, USA
Abstract: Classes of real-world datasets have various properties (such as imbalance, size, complexity, and class distribution) that make the classification task more difficult. We investigate the robustness of six classification techniques on data exhibiting various combinations of these properties. One artificial domain and six real-world datasets are used in the experiments. The results of our analysis point to frequency-based classifiers (such as the fuzzy and Bayes classifiers) as being more robust to variations in imbalance, size, complexity, and training distribution.
Keywords: classification techniques; learning distribution; imbalance data; fuzzy sets; fuzzy logic; classifier performance; data complexity; size.
International Journal of Advanced Intelligence Paradigms, 2011 Vol.3 No.3/4, pp.341 - 366
Published online: 26 Mar 2015