Title: Parallel lossless data compression using the Burrows-Wheeler Transform

 

Author: Jeff Gilchrist, Aysegul Cuhadar

 

Addresses:
Department of Systems and Computer Engineering, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, K1S 5B6, Canada.
Department of Systems and Computer Engineering, Carleton University, 1125 Colonel By Drive, Ottawa, Ontario, K1S 5B6, Canada

 

Journal: Int. J. of Web and Grid Services, 2008 Vol.4, No.1, pp.117 - 135

 

Abstract: In this paper, we present parallel algorithms for lossless data compression based on the Burrows-Wheeler Transform (BWT) block-sorting technique. We investigate the performance of using data parallelism and task parallelism for both multi-threaded and message-passing programming. The output produced by the parallel algorithms is fully compatible with their sequential counterparts. To balance the workload among processors we develop a task scheduling strategy. An extensive set of experiments is performed with a shared memory NUMA system using up to 120 processors and on a distributed memory cluster using up to 100 processors. Our experimental results show that significant speedup can be achieved with both data parallel and task parallel methodologies. These algorithms will greatly reduce the amount of time it takes to compress large amounts of data while the compressed data remains in a form that users without access to multiple processor systems can still use.

 

Keywords: lossless data compression; Burrows-Wheeler Transform; BWT; parallel computing; distributed computing; bzip2; message passing interface; MPI; multi-threading; data parallelism; task parallelism; workload balance; task scheduling.

 

DOI: http://dx.doi.org/10.1504/IJWGS.2008.018498

 

 

Editors Full Text AccessAccess for SubscribersPurchase this articleComment on this article