Authors: Yaokai Feng, Kunihiko Kaneko, Akifumi Makinouchi
Addresses: Graduate School of Information Science and Electrical Engineering, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, Japan; Graduate School of Information Science and Electrical Engineering, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, Japan; Department of Information Network Engineering, Kurume Institute of Technology, 2228-66 Uetsu-Machi, Kurume, Japan
Abstract: With the growing need to process multidimensional queries on OLAP (relational) data, the database community has focused on queries (especially range queries) over large OLAP datasets, treating them as multidimensional data. It is well known that multidimensional indices help improve the performance of such queries. However, we found that when existing multidimensional indices are used with OLAP data, much information irrelevant to the query must also be read from disk, which greatly degrades search performance. This problem stems from a particular characteristic of the queries actually issued on OLAP data: in many OLAP applications, the query conditions are likely to involve only some of the dimensions of the index space, not all of them. Such range queries are called partially-dimensional (PD) range queries in this study. Based on the R*-tree, we propose a new index structure, called the AR*-tree, to handle the queries actually issued on OLAP data. The results of both mathematical analysis and many experiments with different datasets indicate that the AR*-tree clearly improves the performance of PD range queries, especially for large OLAP datasets.
Keywords: OLAP datasets; multidimensional index; multidimensional range queries; R*-tree; relational data; B+-tree; search performance.
International Journal of Data Mining, Modelling and Management, 2011 Vol.3 No.2, pp.150 - 171
Published online: 24 Jul 2011