Title: Dictionary-based sentiment analysis of Hinglish text and comparison with machine learning algorithms

Authors: Harpreet Kaur; Veenu Mangat; Nidhi Krail

Addresses: University Institute of Engineering & Technology (UIET), Panjab University, Chandigarh, India; University Institute of Engineering & Technology (UIET), Panjab University, Chandigarh, India; University Institute of Engineering & Technology (UIET), Panjab University, Chandigarh, India

Abstract: With the recent development of Web 2.0, the number of social networking and online marketing sites has grown considerably. The data obtained from these sites are analysed to support better human decision making. Sentiment analysis involves extracting sentiments from reviews; a sentiment can be positive, negative or neutral. Most of the content on the internet is in English, but with growing awareness among users, the amount of data in other languages is gradually increasing. Little work has been done on Indian languages. Hinglish, a mixture of Hindi and English, is an informal language that is very popular in India because people feel more comfortable communicating in their own language. In this paper, we present a dictionary-based approach for Hinglish text classification. We also implemented traditional machine learning classification algorithms, namely support vector machine (SVM), naive Bayes (NB) and maximum entropy (ME), for comparison. We found that, for Hinglish text, dictionary-based classification gives the best accuracy.

Keywords: sentiment classification; sentiment analysis; feature extraction; dictionary; Hinglish text; WordNet; SentiWordNet.

DOI: 10.1504/IJMSO.2017.090759

International Journal of Metadata, Semantics and Ontologies, 2017 Vol.12 No.2/3, pp.90-102

Received: 17 Mar 2017
Accepted: 12 Sep 2017

Published online: 27 Mar 2018