Authors: Nawaf A. Abdulla; Mahmoud Al-Ayyoub; Mohammed Naji Al-Kabi
Addresses: Department of Computer Science, Faculty of Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan; Department of Computer Science, Faculty of Computer and Information Technology, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan; Science and Information Technology Faculty, Zarqa University, P.O. Box 132222, Zarqa 13132, Jordan
Abstract: With the evolution of Web 2.0 technology, internet users can more readily post comments and reviews to express their opinions and feelings on virtually any topic. Hence, the need to automatically identify the polarity (positive, negative, or neutral) of these comments arose, and a new interdisciplinary field called sentiment analysis (SA) emerged. Unfortunately, while many studies have been conducted on the English language, those on the Arabic language are quite few. In addition, publicly available datasets and testing tools for SA of Arabic text are rare. In this paper, a relatively large dataset of Arabic comments is manually collected and annotated. The source is one of the most widely used social networks in the Arab world, Yahoo!-Maktoob. A comprehensive analysis of this dataset is presented, and two popular classifiers, support vector machine (SVM) and naive Bayes (NB), are used for empirical experiments. The results show that SVM outperforms NB and achieves a 64% accuracy level.
Keywords: social networking; document-level sentiment analysis; Arabic text analysis; opinion mining; Arabic comments; support vector machine; SVM; naive Bayes.
International Journal of Big Data Intelligence, 2014 Vol.1 No.1/2, pp.103 - 113
Received: 01 Feb 2014
Accepted: 01 Apr 2014
Published online: 23 Jul 2014