Title: Cooperative cache analysis for distributed search engines

Authors: David Dominguez-Sal, Marta Perez-Casany, Josep Lluis Larriba-Pey

Addresses: Computer Architecture Department, DAMA-UPC, Universitat Politecnica de Catalunya, Jordi Girona 1-3, Barcelona, Spain; Applied Mathematics II Department, DAMA-UPC, Universitat Politecnica de Catalunya, Jordi Girona 1-3, Barcelona, Spain; Computer Architecture Department, DAMA-UPC, Universitat Politecnica de Catalunya, Jordi Girona 1-3, Barcelona, Spain

Abstract: In this paper, we study the performance of a distributed search engine from a data caching point of view, using statistical tools on a varied set of configurations. We study two strategies to achieve better performance: cache-aware load balancing, which issues queries to the nodes that already hold the corresponding computation in cache; and cooperative caching (CC), which stores computed contents and transfers them from one node in the network to others. Since cache-aware decisions depend on information about recent history, we also analyse how the ageing of this information affects system performance. Our results show that combining both strategies yields better throughput than implementing cooperative caching or cache-aware load balancing individually, because of a synergistic improvement in the hit rate. Furthermore, the analysis concludes that the data structures used to monitor the system need only moderate precision to achieve optimal throughput.
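
The two strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` and `Dispatcher` classes, the LRU eviction policy, the served-query count used as a load metric, and the `directory` map are all assumptions made for illustration. The dispatcher routes a query to the node it last saw caching that query (cache-aware load balancing), and a node that misses locally asks its peers for the cached result before recomputing it (cooperative caching). The directory is never refreshed on eviction, so it can go stale, which is the kind of aged information the abstract refers to.

```python
from collections import OrderedDict


class Node:
    """A search node with a small LRU cache of computed results (assumed policy)."""

    def __init__(self, name, capacity=2):
        self.name = name
        self.capacity = capacity
        self.cache = OrderedDict()  # query -> result, oldest entry first
        self.load = 0               # queries served (hypothetical load metric)
        self.computed = 0           # queries computed from scratch

    def lookup(self, query):
        """Return the cached result, or None on a miss."""
        if query in self.cache:
            self.cache.move_to_end(query)  # mark as most recently used
            return self.cache[query]
        return None

    def store(self, query, result):
        self.cache[query] = result
        self.cache.move_to_end(query)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry

    def serve(self, query, peers):
        self.load += 1
        result = self.lookup(query)
        if result is None:
            # Cooperative caching: try to fetch the result from a peer's cache.
            for peer in peers:
                if peer is not self:
                    result = peer.lookup(query)
                    if result is not None:
                        break
        if result is None:
            # Miss everywhere: compute (stand-in for the real search work).
            self.computed += 1
            result = f"result({query})"
        self.store(query, result)
        return result


class Dispatcher:
    """Cache-aware load balancer with a possibly stale query-to-node directory."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.directory = {}  # query -> node last known to cache it

    def route(self, query):
        node = self.directory.get(query)
        if node is None:
            # No cache information: fall back to the least loaded node.
            node = min(self.nodes, key=lambda n: n.load)
        result = node.serve(query, self.nodes)
        self.directory[query] = node
        return node.name, result
```

In this sketch a repeated query goes back to the node that served it first, and a node that receives a query cached elsewhere obtains the result from that peer instead of recomputing it.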

Keywords: cooperative cache; cache-aware load balancing; question answering; information retrieval; distributed search engines; data caching.

DOI: 10.1504/IJITCC.2010.035226

International Journal of Information Technology, Communications and Convergence, 2010 Vol.1 No.1, pp.41 - 65

Published online: 16 Sep 2010
