Title: A hybrid index structure for querying large string databases

Authors: Qiang Xue, Sakti Pramanik, Gang Qiang, Qiang Zhu

Addresses: Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA; Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA; Department of Computer Science, University of Central Oklahoma, Edmond, OK 73034, USA; Department of Computer and Information Science, The University of Michigan, Dearborn, MI 48128, USA

Abstract: Rapid growth in e-business has created a demand to efficiently index and search an increasingly large number of strings. In this paper, we propose a hybrid RAM/disk-based index structure, called the HD-tree, which makes proper use of both RAM and disk space to achieve high performance when querying large string databases. Our experimental results demonstrate that the HD-tree outperforms the popular Prefix B-tree for prefix and substring searches. In particular, the average number of disk I/Os is reduced by a factor of two to three, and the total running time is reduced by a factor of approximately five.

Keywords: large string databases; indexing methods; hybrid RAM/disk-based data structure; prefix searching; substring searching; e-business; electronic business; electronic documents; information retrieval; internet.

DOI: 10.1504/IJEB.2005.007269

International Journal of Electronic Business, 2005 Vol.3 No.3/4, pp.243-254

Published online: 30 Jun 2005
