Title: OCLAR: logistic regression optimisation for Arabic customers' reviews

Authors: Marwan Al Omari

Addresses: Centre for Language Sciences and Communication, Celine Centre, Lebanese University, Tayouneh, Beirut, Lebanon

Abstract: Recognising and classifying sentiment in customer feedback is crucial for improving the service experience. Sentiment analysis (SA), a natural language processing (NLP) application, evaluates customers' reviews by computing the polarity of text sequences. This paper extends the brute-force grid search methodology with a pipeline architecture on the OCLAR dataset. The architecture facilitates the search for the inverse regularisation strength (IRS) parameter of logistic regression (LR). The experiments are evaluated simultaneously on several measures, such as accuracy and area under the curve (AUC), among others. The experiments showed a 0.10% improvement in AUC using bag-of-words (BoW) with n-gram levels of unigrams, bigrams, and trigrams, whereas a 0.9% improvement was achieved using term frequency-inverse document frequency (TF*IDF). In particular, the BoW classifier achieved its largest improvement in negative prediction, with 0.21% in recall and 0.27% in f-measure, whereas in positive prediction it achieved 0.03% in precision and 0.01% in f-measure.
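The search described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation: the toy English reviews, the hold-out split (used here instead of whatever validation scheme the paper applies), the gradient-descent trainer, and the names `ngrams`, `vectorise`, `fit_lr`, `auc`, `grid_search`, `C_GRID` and `NGRAM_GRID` are all assumptions made for illustration. It builds BoW n-gram features, fits an L2-regularised LR for each candidate inverse regularisation strength `C`, and keeps the setting with the highest validation AUC.

```python
import math
from collections import Counter


def ngrams(text, n_max=3):
    """Unigrams up to n_max-grams of a whitespace-tokenised text."""
    toks = text.split()
    return [" ".join(toks[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(toks) - n + 1)]


def vectorise(texts, n_max=3):
    """Bag-of-words: one sparse count dict per text."""
    return [Counter(ngrams(t, n_max)) for t in texts]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def score(w, b, x):
    """Predicted probability of the positive class for one sparse vector."""
    return sigmoid(b + sum(w.get(f, 0.0) * v for f, v in x.items()))


def fit_lr(X, y, C, epochs=300, step=0.5):
    """L2-regularised LR trained by batch gradient descent.

    Minimises mean log-loss + ||w||^2 / (2 * C * n), so a larger C
    (inverse regularisation strength) means a weaker penalty.
    """
    w, b, n = {}, 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = Counter(), 0.0
        for x, t in zip(X, y):
            err = score(w, b, x) - t
            grad_b += err
            for f, v in x.items():
                grad_w[f] += err * v
        for f in set(grad_w) | set(w):
            g = grad_w[f] / n + w.get(f, 0.0) / (C * n)
            w[f] = w.get(f, 0.0) - step * g
        b -= step * grad_b / n
    return w, b


def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties = 0.5)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def grid_search(train, valid, c_grid, ngram_grid):
    """Exhaustively try every (n_max, C) pair; return (best AUC, C, n_max)."""
    (tr_texts, tr_y), (va_texts, va_y) = train, valid
    best = None
    for n_max in ngram_grid:
        X_tr, X_va = vectorise(tr_texts, n_max), vectorise(va_texts, n_max)
        for C in c_grid:
            w, b = fit_lr(X_tr, tr_y, C)
            a = auc([score(w, b, x) for x in X_va], va_y)
            if best is None or a > best[0]:
                best = (a, C, n_max)
    return best


# Hypothetical toy data standing in for labelled reviews (1 = positive).
train = (["great food fast service", "very good food", "great place good service",
          "bad service slow delivery", "food was not good", "slow and bad"],
         [1, 1, 1, 0, 0, 0])
valid = (["good food", "great service", "bad food slow service", "not good"],
         [1, 1, 0, 0])

C_GRID = [0.01, 0.1, 1.0, 10.0, 100.0]   # candidate IRS values (assumed)
NGRAM_GRID = [1, 2, 3]                   # unigrams, +bigrams, +trigrams

best_auc, best_C, best_n = grid_search(train, valid, C_GRID, NGRAM_GRID)
print(f"best AUC={best_auc:.3f} at C={best_C}, n-grams up to {best_n}")
```

Replacing the raw counts in `vectorise` with TF*IDF weights would give the second feature variant mentioned in the abstract; the search loop itself would stay the same.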

Keywords: Arabic sentiment analysis; ASA; natural language processing; NLP; Arabic NLP; machine learning; ML; logistic regression; LR; optimisation.

DOI: 10.1504/IJBIDM.2022.122177

International Journal of Business Intelligence and Data Mining, 2022, Vol. 20, No. 3, pp. 251-273

Received: 23 Apr 2019
Accepted: 30 Jul 2019

Published online: 11 Apr 2022
