Title: Comparative study of balancing methods: case of imbalanced medical data

Authors: Sara Belarouci; Sarra Bouchikhi; Mohammed Amine Chikh

Addresses: Biomedical Engineering Laboratory, Tlemcen University, Tlemcen, Algeria; Biomedical Engineering Laboratory, Tlemcen University, Tlemcen, Algeria; Biomedical Engineering Laboratory, Tlemcen University, Tlemcen, Algeria

Abstract: Imbalanced learning problems have an unequal distribution of data samples among classes: most samples belong to one or a few majority classes, and the remainder to the minority classes. Learning from imbalanced data is of utmost importance to the research community because it arises in many vital real-world classification problems, such as medical diagnosis, and many works have addressed the classification of imbalanced data sets. In medical data classification, we often face an imbalanced number of data samples in which at least one of the classes constitutes only a very small minority of the data. In this paper, we propose a learning method based on a cost-sensitive extension of the Least Mean Square (LMS) algorithm that penalises errors on different samples with different weights, together with some rules of thumb for determining those weights. After the balancing phase, we apply different classifiers (Support Vector Machine [SVM], k-Nearest Neighbour [k-NN] and Multilayer Perceptron [MLP]) to the new balanced data set. We also compare the results obtained by the LMS algorithm with those obtained by sampling techniques (under-sampling, oversampling and the Synthetic Minority Oversampling Technique [SMOTE]).
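The idea of a cost-sensitive LMS update can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state their rules of thumb for the weights, so the sketch assumes inverse class frequency as the per-class cost, labels in {-1, +1}, and a plain Widrow-Hoff update. The function names (`class_weights`, `cost_sensitive_lms`, `predict`) and the toy one-dimensional data are hypothetical.

```python
import random


def class_weights(labels):
    """Assumed rule of thumb (not from the paper): inverse class frequency.

    Each class gets weight n / (n_classes * class_count), so errors on the
    rarer class are penalised more heavily.
    """
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def cost_sensitive_lms(X, y, lr=0.01, epochs=50, seed=0):
    """Widrow-Hoff (LMS) updates, each scaled by the cost of the sample's class."""
    rng = random.Random(seed)
    cost = class_weights(y)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            pred = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            err = y[i] - pred
            # The class cost multiplies the usual LMS step.
            step = lr * cost[y[i]] * err
            w = [wj + step * xj for wj, xj in zip(w, X[i])]
            b += step
    return w, b


def predict(w, b, x):
    """Threshold the linear output at zero to obtain a label in {-1, +1}."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy imbalanced data (hypothetical): 90 majority samples near 0,
# 10 minority samples near 3.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 0.5)] for _ in range(90)]
X += [[data_rng.gauss(3.0, 0.5)] for _ in range(10)]
y = [-1] * 90 + [1] * 10

weights = class_weights(y)  # the minority class gets weight 5.0 for a 90/10 split
w, b = cost_sensitive_lms(X, y)
print(weights, w, b)
```

With a 90/10 split, each minority error moves the weights nine times as far as a majority error of the same size, which is the balancing effect the abstract describes. Any other weighting rule can be swapped into `class_weights` without touching the update loop.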

Keywords: imbalanced medical data; least mean squares; LMS; undersampling; oversampling; SMOTE; multilayer perceptron; k-nearest neighbour; kNN; support vector machines; SVM; imbalanced learning; medical data classification; sampling techniques; classifiers.

DOI: 10.1504/IJBET.2016.078288

International Journal of Biomedical Engineering and Technology, 2016 Vol.21 No.3, pp.247 - 263

Received: 10 Jul 2015
Accepted: 02 Nov 2015

Published online: 14 Aug 2016
