Title: Using implicitly and explicitly rated online customer reviews to build opinionated Arabic lexicons

Authors: Mohammad Daoud

Addresses: Department of Computer Science, The American University of Madaba, Madaba, Jordan

Abstract: Creating an opinionated lexicon is an important step towards a reliable social media analysis system. In this article, we propose an approach and describe an experiment to build a polarised Arabic lexical database by analysing implicitly and explicitly rated online customer reviews. These reviews are written in Modern Standard Arabic and the Palestinian/Jordanian dialect. The resulting lexicon therefore contains casual slang and dialectal entries used by the online community, which is useful for sentiment analysis of informal social media micro-blogs. We extracted 28,000 entries by processing 15,100 reviews and by expanding the initial lexicon through Google Translate. To address the problem of ambiguous opinions in certain online posts, where the text of the review does not match the given rating (the explicit rating), we calculated an implicit rating for every review, derived from its text. Each entry was given a polarity tag and a confidence score. High confidence scores increased the precision of the polarisation process, and the explicit rating increased the coverage and confidence of polarity.
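
As an illustration only, the idea of giving each lexicon entry a polarity tag and a confidence score from rated reviews could be sketched as follows. This is not the article's method: the function name `build_lexicon`, the 1-5 rating scale, the neutral midpoint, and the confidence formula |p - n| / (p + n) are all assumptions made for this sketch, and the toy reviews are in English purely for readability. The implicit (text-derived) rating described in the abstract is not modelled here.

```python
from collections import Counter


def build_lexicon(reviews, neutral_rating=3):
    """Hypothetical sketch: derive (polarity, confidence) per term.

    reviews: iterable of (text, rating) pairs; a 1-5 rating scale is assumed.
    """
    pos, neg = Counter(), Counter()
    for text, rating in reviews:
        if rating == neutral_rating:
            continue  # skip reviews with no clear explicit polarity
        target = pos if rating > neutral_rating else neg
        # Count each term at most once per review.
        target.update(set(text.split()))

    lexicon = {}
    for term in pos.keys() | neg.keys():
        p, n = pos[term], neg[term]
        if p == n:
            continue  # evenly split terms get no polarity tag
        polarity = "positive" if p > n else "negative"
        # Assumed confidence: how one-sided the term's usage is, in (0, 1].
        confidence = abs(p - n) / (p + n)
        lexicon[term] = (polarity, confidence)
    return lexicon


# Toy data, invented for illustration.
reviews = [
    ("great food great service", 5),
    ("great view slow service", 4),
    ("slow service bad food", 1),
    ("bad experience", 2),
]
lexicon = build_lexicon(reviews)
```

In this sketch, a term seen only in positively rated reviews gets the maximum confidence of 1.0, while a term that appears on both sides gets a lower score; keeping only entries above a confidence threshold would trade coverage for precision.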

Keywords: polarised lexicon; social media analysis; opinion mining; term extraction.

DOI: 10.1504/IJDMMM.2019.098968

International Journal of Data Mining, Modelling and Management, 2019 Vol.11 No.2, pp.189 - 203

Received: 27 Sep 2017
Accepted: 09 Jun 2018

Published online: 20 Feb 2019
