Title: Improved concept-based query expansion using Wikipedia

Authors: M. Yuvarani; N.Ch.S.N. Iyengar; A. Kannan

Addresses: Lead, Education and Research, Infosys Limited, Bangalore 560 100, Karnataka, India; School of Computing Science and Engineering, VIT University, Vellore 632 014, Tamil Nadu, India; Department of Information Science and Technology, Anna University, Chennai 600 025, Tamil Nadu, India

Abstract: Query formulation has always been a challenge for users. In this paper, we propose a novel interactive query expansion methodology that identifies and presents the potential directions (generalised concepts) for a given query, enabling the user to explore the topic of interest further. The proposed methodology, the concept-based direction (CoD) finder, relies on an external knowledge repository to find the directions. Wikipedia, the most important non-profit crowdsourcing project, serves as the external knowledge repository for the CoD finder methodology. CoD finder identifies the concepts for the given query and derives the generalised direction for each concept, based on the content of the Wikipedia article and the categories it belongs to. The CoD finder methodology has been evaluated in a crowdsourcing marketplace, Amazon's Mechanical Turk, to measure the quality of the identified potential directions. The evaluation results show that the potential directions identified by the CoD finder methodology produce better precision and recall for the given queries.
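
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory `ARTICLE_CATEGORIES` table is a hypothetical stand-in for Wikipedia, the substring matching in `find_concepts` is an assumed simplification of concept identification, and the function names are invented for illustration. The `precision_recall` helper uses the standard definitions, applied to a set of directions that human judges (for example, crowd workers) have marked as relevant.

```python
# Hypothetical toy data standing in for Wikipedia: article title -> categories.
# These entries are illustrative assumptions, not taken from the paper.
ARTICLE_CATEGORIES = {
    "Jaguar": ["Big cats", "Mammals of South America"],
    "Jaguar Cars": ["Car manufacturers", "British brands"],
    "Python (programming language)": ["Programming languages"],
    "Pythonidae": ["Snakes", "Reptiles"],
}


def find_concepts(query):
    """Return article titles containing the query term (case-insensitive).

    Assumed simplification: real concept identification would be richer.
    """
    q = query.lower()
    return [title for title in ARTICLE_CATEGORIES if q in title.lower()]


def find_directions(query):
    """Map each matched concept to its categories, used here as directions."""
    return {title: ARTICLE_CATEGORIES[title] for title in find_concepts(query)}


def precision_recall(proposed, relevant):
    """Standard precision and recall of proposed items against a relevant set."""
    proposed, relevant = set(proposed), set(relevant)
    hits = len(proposed & relevant)
    precision = hits / len(proposed) if proposed else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    directions = find_directions("jaguar")
    for concept, cats in directions.items():
        print(concept, "->", cats)

    # Hypothetical judged-relevant directions for the query "jaguar".
    judged = {"Big cats", "Car manufacturers", "Sports teams"}
    proposed = [c for cats in directions.values() for c in cats]
    print(precision_recall(proposed, judged))
```

In this toy setting the query "jaguar" matches two concepts, and each concept's categories are offered as candidate directions; a user would then pick one to narrow the search.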

Keywords: index terms; crowdsourcing; query expansion; direction finder; web search; query formulation; Wikipedia; concept-based queries; information retrieval.

DOI: 10.1504/IJCNDS.2013.054833

International Journal of Communication Networks and Distributed Systems, 2013 Vol.11 No.1, pp.26 - 41

Published online: 28 Feb 2014
