Authors: Mohammad Javad Shayegan; Mehrdad Molanorouzi
Addresses: Department of Computer Engineering, University of Science and Culture, Tehran, Iran; Department of Computer Engineering, University of Science and Culture, Tehran, Iran
Abstract: Sentiment analysis in social media has attracted considerable attention because the results of such studies are highly applicable in social, economic, and political contexts. This study presents an approach for collecting data from Twitter and for storing and analysing the data using Hadoop as a big data platform, together with a hybrid trial-and-error model that combines Bayes' theorem with a dictionary of words for sentiment analysis. The method classifies tweets into two classes, positive and negative, based on the probability of positive words and negative words. According to the results, the accuracy of the proposed approach rose from 67% to 71%. A weighted dictionary was then employed to achieve higher accuracy. As a result, the accuracy of the proposed approach reached 78% in another analysis conducted on the same data.
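Illustration: the abstract does not give the exact scoring formula, so the following is only a minimal sketch of how a dictionary-based classifier with a Bayes-style posterior and optional word weights might look. The lexicon entries, the weights, the add-one smoothing, the 0.5 prior, and the names `LEXICON`, `tokenize`, and `classify` are all assumptions made for illustration and are not taken from the paper.

```python
import re

# Hypothetical toy lexicon: word -> (polarity, weight). Not from the paper.
LEXICON = {
    "good": ("pos", 1.0),
    "great": ("pos", 2.0),
    "love": ("pos", 2.0),
    "bad": ("neg", 1.0),
    "awful": ("neg", 2.0),
    "hate": ("neg", 2.0),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def classify(tweet, lexicon=LEXICON, weighted=True, prior_pos=0.5):
    """Return (label, posterior probability of the positive class).

    Each lexicon hit adds its weight (or 1.0 when weighted=False) to the
    evidence for its polarity. Both classes start at 1.0 (add-one
    smoothing, an assumption) so a tweet with no hits falls back to the prior.
    """
    pos_mass = 1.0
    neg_mass = 1.0
    for token in tokenize(tweet):
        if token in lexicon:
            polarity, weight = lexicon[token]
            w = weight if weighted else 1.0
            if polarity == "pos":
                pos_mass += w
            else:
                neg_mass += w

    # Likelihood-like terms from the share of positive vs negative evidence.
    like_pos = pos_mass / (pos_mass + neg_mass)
    like_neg = neg_mass / (pos_mass + neg_mass)

    # Bayes' theorem over the two classes.
    evidence = like_pos * prior_pos + like_neg * (1.0 - prior_pos)
    posterior_pos = like_pos * prior_pos / evidence

    label = "positive" if posterior_pos >= 0.5 else "negative"
    return label, posterior_pos


if __name__ == "__main__":
    for text in ["I love this, it is great", "good but awful"]:
        print(text, classify(text, weighted=False), classify(text, weighted=True))
```

In this toy setting, a mixed tweet such as "good but awful" is a tie under the unweighted dictionary but leans negative once "awful" carries a larger weight, which is the kind of effect a weighted dictionary is meant to capture.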
Keywords: Twitter sentiment analysis; TSA; big data; lexicon sentiment analysis; Hadoop.
International Journal of Web Based Communities, 2021 Vol.17 No.3, pp.149 - 162
Received: 19 Dec 2019
Accepted: 06 Jan 2021
Published online: 09 Jul 2021