New under-sampling methods to address the problem of unbalanced sentiment classification: application on Arabic datasets
by Asmaa Mountassir; Houda Benbrahim; Ilham Berrada
International Journal of Information and Communication Technology (IJICT), Vol. 9, No. 1, 2016

Abstract: This paper presents a study of the problem of unbalanced datasets in supervised sentiment classification in an Arabic context. We propose three different methods to under-sample the majority-class documents. Our goal is to compare the effectiveness of the proposed methods with that of common random under-sampling. We also aim to evaluate the behaviour of the classifier under different under-sampling rates. We use three common classifiers, namely Naïve Bayes, support vector machines and k-nearest neighbours. The experiments are carried out on two different Arabic datasets that we have built internally. We show that the results obtained on the first dataset, which is slightly skewed, are better than those obtained on the second, which is highly skewed. We also conclude that Naïve Bayes is sensitive to dataset size: the more we reduce the data, the more the results degrade. Support vector machines, by contrast, are highly sensitive to unbalanced datasets. We record unstable behaviour for k-nearest neighbours. The results also show that we can rely on the proposed techniques and that they are typically competitive with random under-sampling.
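For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of random under-sampling at a chosen rate. It is not the authors' implementation, and it does not reproduce any of the three proposed methods, which the abstract does not describe. The function name `random_undersample`, its parameters, the label names and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_undersample(docs, labels, majority_label, rate, seed=0):
    """Randomly drop a fraction `rate` (0.0 to 1.0) of majority-class documents.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Indices of all documents that belong to the majority class.
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    # Number of majority documents to discard at this under-sampling rate.
    n_drop = int(round(rate * len(majority_idx)))
    dropped = set(rng.sample(majority_idx, n_drop))
    # Keep every minority document and the surviving majority documents.
    kept = [i for i in range(len(labels)) if i not in dropped]
    return [docs[i] for i in kept], [labels[i] for i in kept]


# Toy, made-up data: 8 majority ("neg") and 2 minority ("pos") documents.
docs = [f"doc{i}" for i in range(10)]
labels = ["neg"] * 8 + ["pos"] * 2

sub_docs, sub_labels = random_undersample(docs, labels, "neg", rate=0.75)
counts = Counter(sub_labels)
print(counts["neg"], counts["pos"])
```

Varying `rate` from 0 towards 1 gives a simple way to observe how a classifier reacts as the majority class shrinks, which is the kind of comparison the abstract describes.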

Online publication date: Wed, 13-Jul-2016
