Title: Discrimination of personal web pages by extracting subjective expressions

Authors: Takahiro Hayashi, Koji Abe, Debabrata Roy, Rikio Onai

Addresses: Faculty of Engineering, Department of Information Engineering, Niigata University, 8050 ikarashi-2-no-cho, Nishiku, Niigata, 950-2181, Japan; Department of Informatics, School of Science and Engineering, Kinki University, 3-4-1, Kowakae, Higashi-Osaka, Osaka, 577-8502, Japan; Department of Electrical Engineering, Bengal Engineering and Science University, 14 Gobra Road, Kolkata 700 046, India; Faculty of Electro-Communications, Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo, 182-8585, Japan

Abstract: This paper presents a method for discriminating between personal and non-personal web pages. The method can support surveys of personal opinions about products and services. The proposed method extracts subjective expressions from each page and then scores the page by quantitatively evaluating its subjectivity. We evaluated the method on 1200 web pages collected from four categories: products, tourist spots, restaurants, and movies. Compared with the categorisations produced by a general search engine, the proposed method performed significantly better in every category.

Keywords: document classification; personal web pages; subjective expressions; personal opinions; products; tourist destinations; restaurants; movies; films; personal preferences.

DOI: 10.1504/IJBIDM.2009.025411

International Journal of Business Intelligence and Data Mining, 2009 Vol.4 No.1, pp.62 - 77

Published online: 21 May 2009
