Authors: Huaping Guo; Tao Wei
Addresses: School of Computer and Information Technology, Xinyang Normal University, Xinyang, 464000, China; Computer College, Henan Institute of Engineering, Zhengzhou, 450000, China
Abstract: Class imbalance is very common in real-world data, and traditional state-of-the-art classifiers do not work well on imbalanced datasets. In this paper, we apply the well-known statistical model logistic regression to the imbalanced learning problem. To improve its performance, we use clustering algorithms as a data pre-processing step that partitions the majority-class data into clusters; logistic regression is then learned on the corresponding rebalanced datasets. Experimental results show that, compared with other state-of-the-art methods, the proposed method performs significantly better on recall, g-mean, f-measure, AUC and accuracy.
Keywords: class imbalance; logistic regression; clustering.
International Journal of Computational Science and Engineering, 2019 Vol.18 No.1, pp.54 - 64
Received: 07 Mar 2017
Accepted: 15 Sep 2017
Published online: 14 Dec 2018
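
The pipeline outlined in the abstract (cluster the majority class, then learn logistic regression on rebalanced data) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not say which clustering algorithm is used or how the rebalanced datasets are built and combined, so everything specific here is an assumption: plain k-means, one rebalanced dataset per majority cluster (that cluster plus all minority samples), gradient-descent logistic regression, and averaging of the per-cluster probabilities. All function names and the synthetic data are hypothetical.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns non-empty clusters."""
    centers = random.Random(seed).sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [[sum(col) / len(cl) for col in zip(*cl)] if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return [cl for cl in clusters if cl]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    """Probability of the minority (positive) class; the bias is stored last in w."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])

def fit_logreg(X, y, lr=0.1, epochs=300):
    """Logistic regression fitted by batch gradient descent on the log-loss."""
    d, n = len(X[0]), len(X)
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, t in zip(X, y):
            err = predict_proba(w, x) - t
            for i in range(d):
                grad[i] += err * x[i]
            grad[-1] += err
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def fit_cluster_ensemble(majority, minority, k=3, seed=0):
    """Assumed rebalancing scheme: one model per majority cluster plus all minority data."""
    models = []
    for cluster in kmeans(majority, k, seed=seed):
        X = cluster + minority
        y = [0] * len(cluster) + [1] * len(minority)
        models.append(fit_logreg(X, y))
    return models

def ensemble_proba(models, x):
    """Average the per-cluster models' probabilities (an assumed combination rule)."""
    return sum(predict_proba(w, x) for w in models) / len(models)

# Synthetic 10:1 imbalanced data, purely for illustration.
rng = random.Random(42)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
minority = [[rng.gauss(4, 0.5), rng.gauss(4, 0.5)] for _ in range(30)]
models = fit_cluster_ensemble(majority, minority, k=3)
print(ensemble_proba(models, [4.0, 4.0]), ensemble_proba(models, [0.0, 0.0]))
```

Under these assumptions, each model sees a majority-to-minority ratio much closer to 1:1 than the original 10:1, which is the rebalancing effect the abstract refers to. How the paper actually forms and combines its rebalanced datasets should be checked against the full text.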