Authors: Rehab M. Duwairi; Islam Qarqaz
Addresses: Department of Computer Information Systems, Jordan University of Science and Technology, Irbid 22110, Jordan; Department of Computer Science, Jordan University of Science and Technology, Irbid 22110, Jordan
Abstract: Sentiment analysis aims to determine the polarity embedded in people's comments and reviews. It is important for companies and organisations that are interested in evaluating their products or services. The current paper deals with sentiment analysis of Arabic reviews. Three classifiers were applied to an in-house developed dataset of tweets/comments: Naïve Bayes, SVM and K-nearest neighbour. This paper also addresses the effect of term weighting schemes on the accuracy of the results. The binary model, term frequency, and term frequency-inverse document frequency (TF-IDF) were used to assign weights to the tokens of tweets/comments. The results show that switching between the three weighting schemes only slightly affects accuracy. The results also show that the classifiers were able to exclude false examples (high precision) but were less successful in identifying all correct examples (low recall).
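As an illustration only, the three weighting schemes and the precision/recall measures named in the abstract can be sketched as follows. This is not the authors' implementation: the function names `term_weights` and `precision_recall`, the placeholder English tokens, the "pos"/"neg" labels, and the unsmoothed IDF form log(N/df) are all assumptions made for this sketch.

```python
import math
from collections import Counter


def term_weights(docs, scheme="tfidf"):
    """Return one {token: weight} dict per tokenised document.

    Hypothetical helper; assumes the IDF form log(N / df), no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each token in this document
        if scheme == "binary":
            weighted.append({tok: 1.0 for tok in tf})
        elif scheme == "tf":
            weighted.append({tok: float(c) for tok, c in tf.items()})
        elif scheme == "tfidf":
            weighted.append(
                {tok: c * math.log(n_docs / df[tok]) for tok, c in tf.items()}
            )
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return weighted


def precision_recall(y_true, y_pred, positive="pos"):
    """Precision and recall for one class (label names are assumptions)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Placeholder tokenised comments (illustrative, not from the paper's dataset).
docs = [["good", "good", "service"], ["bad", "service"], ["good", "price"]]

for scheme in ("binary", "tf", "tfidf"):
    print(scheme, term_weights(docs, scheme)[0])
```

A classifier that labels few comments as positive but is right whenever it does would score high precision and low recall under `precision_recall`, which is the pattern the abstract reports.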
Keywords: sentiment analysis; sentiment classification; opinion mining; polarity detection; supervised learning; text mining; Arabic language; Arabic reviews; tweets; user comments; user reviews; naive Bayes; SVM; support vector machines; K-nearest neighbour; kNN; term weighting.
International Journal of Data Mining, Modelling and Management, 2016 Vol.8 No.4, pp.369 - 381
Accepted: 10 Jul 2015
Published online: 29 Dec 2016