Authors: Takahiro Hayashi, Koji Abe, Debabrata Roy, Rikio Onai
Addresses: Faculty of Engineering, Department of Information Engineering, Niigata University, 8050 Ikarashi-2-no-cho, Nishiku, Niigata, 950-2181, Japan; Department of Informatics, School of Science and Engineering, Kinki University, 3-4-1, Kowakae, Higashi-Osaka, Osaka, 577-8502, Japan; Department of Electrical Engineering, Bengal Engineering and Science University, 14 Gobra Road, Kolkata 700 046, India; Faculty of Electro-Communications, Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, Japan
Abstract: This paper presents a method for discriminating between personal and non-personal web pages. The method can support surveys of personal opinions about products and services. In the proposed method, subjective expressions are extracted from each page, and the page is then scored by quantitatively evaluating the subjectivity it contains. We evaluated the performance of the proposed method on 1200 web pages collected from four categories: products, tourist spots, restaurants, and movies. Compared with the categorisations produced by a general search engine, the proposed method performed significantly better in every category.
Keywords: document classification; personal web pages; subjective expressions; personal opinions; products; tourist destinations; restaurants; movies; films; personal preferences.
International Journal of Business Intelligence and Data Mining, 2009, Vol. 4, No. 1, pp. 62-77
Available online: 21 May 2009