Title: A mixture language model for the classification of Chinese online reviews

Authors: Ming Jiang; Jian Wang; Xingqi Wang; Jingfan Tang; Chunming Wu

Addresses: Institute of Software and Intelligent Technology, Hangzhou Dianzi University, Hangzhou, 310018, China; Zhejiang Provincial Engineering Center on Media Data Cloud Processing and Analysis, Hangzhou Dianzi University, Hangzhou, 310018, China ' Institute of Software and Intelligent Technology, Hangzhou Dianzi University, Hangzhou, 310018, China; Zhejiang Provincial Engineering Center on Media Data Cloud Processing and Analysis, Hangzhou Dianzi University, Hangzhou, 310018, China ' Institute of Software and Intelligent Technology, Hangzhou Dianzi University, Hangzhou, 310018, China; Zhejiang Provincial Engineering Center on Media Data Cloud Processing and Analysis, Hangzhou Dianzi University, Hangzhou, 310018, China ' Institute of Software and Intelligent Technology, Hangzhou Dianzi University, Hangzhou, 310018, China; Zhejiang Provincial Engineering Center on Media Data Cloud Processing and Analysis, Hangzhou Dianzi University, Hangzhou, 310018, China ' College of Computer Science and Technology, Zhejiang University, Hangzhou, 310027, China

Abstract: In this paper, we propose an unsupervised topic and sentiment mixture model (the LSS model) for the sentiment classification of online reviews. The model is based mainly on the LDA model and incorporates a sentiment factor. It adds the constraint that all words in a sentence are generated from one topic and one sentiment; that is, the model assumes that a word in a sentence carries only one meaning. The LSS model is fully unsupervised and needs neither labelled corpora nor sentiment seed words. Experiments show that the LSS model performs better than the JST and ASUM models: its F-1 value for sentiment classification is 8% higher than that of the ASUM model and 12.5% higher than that of the JST model.
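
The sentence-level constraint described in the abstract can be illustrated with a minimal generative sketch. The abstract does not give the LSS model's priors, hyperparameters, or sampling order, so the symmetric Dirichlet priors, the default values, the sentiment-then-topic order, and every function name below are assumptions made for illustration, not the authors' specification.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw from a symmetric Dirichlet by normalising Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def make_word_distributions(num_sentiments, num_topics, vocab_size, rng, beta=0.5):
    """Assumed: one word distribution per (sentiment, topic) pair."""
    return [
        [sample_dirichlet(beta, vocab_size, rng) for _ in range(num_topics)]
        for _ in range(num_sentiments)
    ]


def generate_review(num_sentences, words_per_sentence, num_sentiments,
                    num_topics, vocab, phi, rng, alpha=1.0, gamma=1.0):
    """Generate one review in which each sentence has a single
    sentiment and a single topic shared by all of its words."""
    # Assumed: per-review sentiment distribution.
    pi = sample_dirichlet(gamma, num_sentiments, rng)
    # Assumed: per-review topic distribution for each sentiment.
    theta = [sample_dirichlet(alpha, num_topics, rng)
             for _ in range(num_sentiments)]

    review = []
    for _ in range(num_sentences):
        # The labels are drawn once per sentence, not once per word.
        sentiment = sample_index(pi, rng)
        topic = sample_index(theta[sentiment], rng)
        words = [vocab[sample_index(phi[sentiment][topic], rng)]
                 for _ in range(words_per_sentence)]
        review.append({"sentiment": sentiment, "topic": topic, "words": words})
    return review


rng = random.Random(0)
vocab = ["good", "bad", "screen", "battery", "price"]
phi = make_word_distributions(2, 3, len(vocab), rng)
for sentence in generate_review(4, 6, 2, 3, vocab, phi, rng):
    print(sentence["sentiment"], sentence["topic"], " ".join(sentence["words"]))
```

A word-level model such as plain LDA would instead draw a new topic inside the inner word loop; moving the draw to the sentence level is the only point this sketch is meant to show.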

Keywords: topic modelling; latent Dirichlet allocation; sentiment classification; online reviews; China; language models; unsupervised topic and sentiment mixture model.

DOI: 10.1504/IJICT.2015.066017

International Journal of Information and Communication Technology, 2015 Vol.7 No.1, pp.109 - 122

Received: 24 Aug 2013
Accepted: 15 May 2014

Published online: 30 Nov 2014
