Title: Incremental models for query clustering and query-context aware document clustering

Authors: Poonam Goyal; N. Mehala; Navneet Goyal

Addresses: Department of Computer Science, Birla Institute of Technology and Science, Pilani, 333 031, India (all three authors)

Abstract: Traditional query clustering algorithms are designed to work on data previously collected from a query stream. These algorithms become less effective over time because users' interests, query meanings, and the popularity of topics change. There is therefore a need for incremental algorithms that can accommodate the concept drift that surfaces as new data is added to the collection, without performing a complete re-clustering. We propose an incremental model for query clustering and query-context aware document clustering. The model periodically incorporates new information efficiently and can be applied in a distributed environment. It retains the quality of both query and document clusters and can be applied to the results of hierarchical query clustering algorithms that produce query and document clusters. The model is tested with three hierarchical clustering algorithms on different datasets, including the TREC Session Track 2011 dataset. We also experimented with a variant of the proposed incremental model to compare performance. In all experiments, the proposed model and its variant not only achieve accuracy very close to that of the static models but also offer a significant speedup.

Keywords: query context; concept awareness; click-through log; document clustering; incremental clustering; query clustering; concept drift; information updating.

DOI: 10.1504/IJKWI.2015.075167

International Journal of Knowledge and Web Intelligence, 2015 Vol.5 No.2, pp.146 - 167

Received: 07 Feb 2015
Accepted: 10 Aug 2015

Published online: 05 Mar 2016
