Title: Adaptive hybrid partitioning for OLAP query processing in a database cluster

Authors: Camille Furtado, Alexandre A.B. Lima, Esther Pacitti, Patrick Valduriez, Marta Mattoso

Addresses: Computer Science Department, COPPE, Federal University of Rio de Janeiro (UFRJ), P.O. Box 68511, 21941-972 Rio de Janeiro, Brazil; School of Sciences and Technology, Unigranrio University, R. Prof. Jose de Souza Herdy, 1160, 25071-202, Duque de Caxias, Brazil; INRIA and LINA, University of Nantes, 2 rue de la Houssiniere, BP 92208, 44322 Nantes Cedex 3, France; INRIA and LINA, University of Nantes, 2 rue de la Houssiniere, BP 92208, 44322 Nantes Cedex 3, France; Computer Science Department, COPPE, Federal University of Rio de Janeiro (UFRJ), P.O. Box 68511, 21941-972 Rio de Janeiro, Brazil

Abstract: We consider the use of a database cluster for high-performance support of Online Analytical Processing (OLAP) applications. OLAP intra-query parallelism can be obtained by partitioning the database tables across cluster nodes. We propose to combine physical and virtual partitioning into a partitioning scheme called Adaptive Hybrid Partitioning (AHP). AHP requires less disk space while allowing for load balancing. We developed a prototype for OLAP parallel query processing in database clusters using AHP. Our experiments on a 32-node database cluster using the TPC-H benchmark demonstrate linear and super-linear speedup. Thus, AHP can significantly reduce the execution time of typical OLAP queries.

Keywords: database clusters; distributed database design; virtual partitioning; physical partitioning; dynamic load balancing; adaptive hybrid partitioning; OLAP queries; query processing; online analytical processing; high-performance computing.

DOI: 10.1504/IJHPCN.2008.022301

International Journal of High Performance Computing and Networking, 2008, Vol. 5, No. 4, pp. 251-262

Published online: 27 Dec 2008
