Title: Vector space search engines that maximise expected user utility

Authors: Nilgun Ferhatosmanoglu, Theodore T. Allen, Guadalupe Canahuate

Addresses: Department of Industrial and Systems Engineering, The Ohio State University, 1971 Neil Avenue, 210 Baker Systems, Columbus, Ohio 43210-1271, USA; Department of Industrial and Systems Engineering, The Ohio State University, 1971 Neil Avenue, 210 Baker Systems, Columbus, Ohio 43210-1271, USA; Department of Computer Science and Engineering, 395 Dreese Laboratories, 2015 Neil Avenue, Columbus, OH 43210-1277, USA

Abstract: Vector space methods are perhaps the most widely studied type of search engine. Yet these search engines are generally not optimal, in the sense that search results are based on the current query and the available database without considering information about user preferences. This article establishes a rigorous relationship between the tuning of dimensional weights and the maximisation of the expected utilities of users. The methods can be implemented using standard software for discrete choice analysis and readily available data. The proposed methodology is called 'discrete choice analysis weighting' (DCAW). A test-bed evaluation of DCAW conducted on around 10,000 news items offers promising results for further study. Several opportunities for future research are also proposed.
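
The link between dimensional weights and expected user utility can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that a user's utility for a document is a weighted inner product of the query and document vectors, and that choices follow a multinomial logit model, the standard model in discrete choice analysis. The function names, the learning rate, and the toy data are all hypothetical.

```python
import math

def choice_probabilities(weights, query, docs):
    """Multinomial logit probability of choosing each document for one query.

    Assumed utility of document j: sum_i weights[i] * query[i] * docs[j][i].
    """
    utils = [sum(w * q * d for w, q, d in zip(weights, query, doc)) for doc in docs]
    top = max(utils)  # subtract the maximum for numerical stability
    exps = [math.exp(u - top) for u in utils]
    total = sum(exps)
    return [e / total for e in exps]

def fit_weights(observations, n_dims, lr=0.1, epochs=200):
    """Estimate dimensional weights by gradient ascent on the logit log-likelihood.

    observations: list of (query, docs, chosen_index) tuples (hypothetical format).
    """
    weights = [1.0] * n_dims  # start from an unweighted vector space
    for _ in range(epochs):
        grad = [0.0] * n_dims
        for query, docs, chosen in observations:
            probs = choice_probabilities(weights, query, docs)
            for i in range(n_dims):
                # Attribute i of document j is query[i] * docs[j][i].
                chosen_x = query[i] * docs[chosen][i]
                expected_x = sum(p * query[i] * doc[i] for p, doc in zip(probs, docs))
                grad[i] += chosen_x - expected_x
        weights = [w + lr * g / len(observations) for w, g in zip(weights, grad)]
    return weights

# Toy data (hypothetical): users always pick the document matching on dimension 0.
observations = [
    ([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], 0),
    ([1.0, 1.0], [[0.0, 1.0], [1.0, 0.0]], 1),
]
fitted = fit_weights(observations, n_dims=2)
print(fitted)
```

In this toy setting the fitted weight on dimension 0 grows relative to dimension 1, so documents matching the query on dimension 0 rank higher. In practice the same estimation would more likely be run in standard discrete choice software, as the abstract suggests.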

Keywords: DCA; discrete choice analysis; information retrieval; LSI; latent semantic indexing; vector space; search engines; dimensional weights; tuning; expected user utility.

DOI: 10.1504/IJMOR.2009.024287

International Journal of Mathematics in Operational Research, 2009 Vol.1 No.3, pp.279-302

Published online: 31 Mar 2009
