Authors: Akula V.S. Siva Rama Rao; P. Ranjana
Addresses: Department of Computer Science and Engineering, Hindustan Institute of Technology and Science, Chennai, India; Department of CSE, SITE, Tadepalligudem, AP, India | Department of Computer Science and Engineering, Hindustan Institute of Technology and Science, Chennai, India
Abstract: Many government schemes have failed for lack of proper feedback while they were under way, leaving investments worth billions of dollars wasted. Sentiment analysis is one of the best approaches for analysing people's opinions on government schemes. Sentiment analysis and machine learning techniques have emerged to analyse huge social media corpora and track people's views on government policies, products and services. The sentiment analysis process consists of several phases, including data discovery, data collection, data pre-processing and data analysis. Stemming is a process that reduces the words in natural language sentences to their morphemes for applications such as sentiment analysis, information retrieval and domain analysis. Stemming is subject to two major kinds of error: over-stemming and under-stemming. Most natural language processing applications for sentiment analysis use the Lancaster and Porter stemming algorithms, in which more than one word is reduced to the same morpheme; this gives the stemmed word etymological behaviour and makes tweets prone to misclassification as false positives and false negatives. The proposed un-prejudice light stemming algorithm prevents this etymological behaviour of the morpheme and preserves its meaning during stemming by selecting the word that has the maximum number of synonyms in a lexical database.
Keywords: empower government; government schemes; NLP; social media networks; under-stemming; over-stemming; stemmer weight; sentiment analysis.
Electronic Government, an International Journal, 2020 Vol.16 No.1/2, pp.118 - 136
Received: 13 Mar 2019
Accepted: 21 May 2019
Published online: 22 Feb 2020
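
The abstract gives only the selection rule of the proposed stemmer (prefer the word with the most synonyms in a lexical database), not its full procedure. The following is a minimal sketch of one possible reading of that rule, not the authors' algorithm. The toy lexicon `TOY_LEXICON`, the suffix rules `SUFFIX_RULES`, and the function names `candidates` and `light_stem` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy lexical database: word -> set of synonyms.
# A real system would query an actual lexical resource instead.
TOY_LEXICON = {
    "policy": {"plan", "programme"},
    "scheme": {"plan", "programme", "strategy"},
    "organ": {"instrument"},
    "organize": {"arrange", "coordinate", "order"},
}

# Hypothetical light suffix rules as (suffix, replacement) pairs.
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", ""),
    ("ing", "e"),
    ("ed", ""),
    ("ed", "e"),
    ("es", ""),
    ("s", ""),
]


def candidates(word):
    """Return the word itself plus every form produced by one suffix rule."""
    word = word.lower()
    forms = [word]
    for suffix, replacement in SUFFIX_RULES:
        # Require at least three remaining letters so short words stay intact.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            forms.append(word[: -len(suffix)] + replacement)
    return forms


def light_stem(word, lexicon):
    """Pick the candidate form with the most synonyms in the lexicon.

    Assumption: forms missing from the lexicon are never chosen, and a word
    with no known candidate is returned unchanged (lower-cased).
    """
    known = [form for form in candidates(word) if form in lexicon]
    if not known:
        return word.lower()
    return max(known, key=lambda form: len(lexicon[form]))


if __name__ == "__main__":
    for token in ["policies", "schemes", "organizing", "tweets"]:
        print(token, "->", light_stem(token, TOY_LEXICON))
```

Under these assumptions "organizing" is reduced to "organize" rather than to the shorter and unrelated "organ", which is the kind of over-stemming the abstract attributes to aggressive stemmers.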