Title: Algorithmic skeletons for multi-core, multi-GPU systems and clusters

Authors: Steffen Ernsting; Herbert Kuchen

Addresses: Department of Information Systems, University of Muenster, Leonardo-Campus 3, 48149 Muenster, Germany; Department of Information Systems, University of Muenster, Leonardo-Campus 3, 48149 Muenster, Germany

Abstract: Due to the lack of high-level abstractions, developers of parallel applications have to deal with low-level details such as coordinating threads or synchronising processes. Thus, parallel programming remains a difficult and error-prone task. To shield the user from these low-level details, algorithmic skeletons have been proposed. They encapsulate typical parallel programming patterns and have emerged as an efficient approach to simplifying the development of parallel applications. In this paper, we present our skeleton library Muesli, which not only simplifies parallel programming but also allows a single application to be written once and executed on a variety of parallel machines, ranging from simple multi-core processors with shared memory to clusters of multi- and many-core processors with distributed memory, as well as multi-GPU systems and GPU clusters. This level of platform independence is not reached by other existing approaches that simplify parallel programming. Internally, the skeletons are based on MPI, OpenMP and CUDA. We demonstrate the portability and efficiency of our approach by providing experimental results.

Keywords: algorithmic skeletons; distributed computing; GPU computing; parallel programming; portable programming; programming environments; message passing; multiprocessing; shared memory systems; distributed memory systems; high performance computing; GPU clusters.

DOI: 10.1504/IJHPCN.2012.046370

International Journal of High Performance Computing and Networking, 2012 Vol.7 No.2, pp.129 - 138

Published online: 05 Apr 2012
