Publisher: Inderscience Publishers

Title: OpenMPC: extended OpenMP for efficient programming and tuning on GPUs

Authors: Seyong Lee; Rudolf Eigenmann

Addresses: Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA; School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN 47907, USA

Abstract: General-purpose graphics processing units (GPGPUs) provide inexpensive, high-performance platforms for compute-intensive applications. However, their programming complexity poses a significant challenge to developers. Even though the compute unified device architecture (CUDA) programming model offers better abstraction, developing efficient GPGPU code is still complex and error-prone. This paper proposes a directive-based, high-level programming model, called OpenMPC, which addresses both programmability and tunability issues on GPGPUs. We have developed a fully automatic compilation and user-assisted tuning system supporting OpenMPC. In addition to a range of compiler transformations and optimisations, the system includes tuning capabilities for generating, pruning, and navigating the search space of compilation variants. Evaluation using 14 applications shows that our system achieves 75% of the performance of the hand-coded CUDA programmes (92% if excluding one exceptional case).

Keywords: OpenMP; graphics processing unit; GPU; compute unified device architecture; CUDA; OpenMPC; programming models; directives; automatic translation; compiler transformation; performance tuning; code generation; optimisation.

DOI: 10.1504/IJCSE.2013.052110

International Journal of Computational Science and Engineering, 2013 Vol.8 No.1, pp.4 - 20

Received: 21 Dec 2011
Accepted: 08 Mar 2012
Published online: 27 Dec 2013
