Title: CORE: core-based synthetic minority over-sampling and borderline majority under-sampling technique

Authors: Chumphol Bunkhumpornpat; Krung Sinapiromsaran

Addresses: Theoretical and Empirical Research Group, Department of Computer Science, Faculty of Science, Chiang Mai University, Chiang Mai 50200, Thailand; Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand

Abstract: Class imbalance learning has recently drawn considerable attention among researchers. In this area, the rare class is the class of primary interest for classification. Unfortunately, traditional machine learning algorithms fail to detect this class because a huge majority class overwhelms a tiny minority class. In this paper, we propose a new technique called CORE to handle the class imbalance problem. The objective of CORE is to strengthen the core of a minority class and to reduce the risk of misclassifying minority instances near the borderline of a majority class. These core and borderline regions are defined by applying a safe level. As a result, the minority class becomes more crowded and dominant. The experiment shows that CORE can significantly improve the predictive performance on a minority class when its dataset is imbalanced.
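
The abstract does not define the safe level, so the following is only an illustrative sketch and not the CORE algorithm itself. It assumes the safe level of an instance is the number of minority instances among its k nearest neighbours, the definition used in Safe-Level-SMOTE. The function name `safe_levels`, the toy data, and both thresholds (`>= 2` for core minority instances, `>= 1` for borderline majority instances) are hypothetical choices made for illustration.

```python
import numpy as np

def safe_levels(X, y, k=5, minority_label=1):
    """Count minority-class points among each point's k nearest neighbours.

    Assumed definition of "safe level"; the abstract does not specify it.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances (adequate for small illustrative datasets).
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    return (y[neighbours] == minority_label).sum(axis=1)

# Toy data: a tight minority cluster (label 1), a majority cluster (label 0),
# and one majority point (index 9) sitting next to the minority cluster.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
              [1.0, 1.0], [1.1, 1.0], [1.0, 1.1], [1.1, 1.1],
              [1.2, 1.2], [0.3, 0.3]])
y = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])

sl = safe_levels(X, y, k=3)

# Hypothetical thresholds: minority points with mostly minority neighbours
# are candidates for over-sampling; majority points with any minority
# neighbour are candidates for under-sampling.
core_minority = np.where((y == 1) & (sl >= 2))[0]
borderline_majority = np.where((y == 0) & (sl >= 1))[0]

print("safe levels:", sl.tolist())
print("core minority indices:", core_minority.tolist())
print("borderline majority indices:", borderline_majority.tolist())
```

In this toy setting, synthetic minority instances would be generated around the core minority points and the borderline majority point would be a candidate for removal; how CORE actually selects and generates instances is described in the full paper, not here.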

Keywords: classification; class imbalance; synthetic minority over-sampling; borderline majority under-sampling.

DOI: 10.1504/IJDMB.2015.068952

International Journal of Data Mining and Bioinformatics, 2015 Vol.12 No.1, pp.44 - 58

Accepted: 03 Jun 2013
Published online: 22 Apr 2015
