Authors: João Vicente Ferreira Lima; Daniel Di Domenico
Addresses: Universidade Federal de Santa Maria, Santa Maria, Rio Grande do Sul, Brazil; Universidade Federal de Santa Maria, Santa Maria, Rio Grande do Sul, Brazil
Abstract: This paper presents HPSM, a high-level C++ framework for exploiting multi-CPU and multi-GPU systems. HPSM enables parallel loops and reductions to execute simultaneously over CPUs and GPUs using three parallel backends: Serial, OpenMP, and StarPU. We analysed the development effort of HPSM with an AXPY program through two standard metrics (NCLOC and ES). In addition, we evaluated performance and energy with three parallel benchmarks: N-Body, Hotspot, and a CFD solver. HPSM reduced code effort by up to 56.9% compared to the StarPU C interface, although it required 2.5× more lines of code than OpenMP. With Hotspot, the CPU-GPU combination attained speedups of up to 92.7× on an x86-based system with four GPUs and up to 108.2× on an IBM POWER8+ system with two GPUs. On both systems, the addition of GPUs improved energy efficiency.
Keywords: high performance computing; CPU-GPU systems; parallel programming models; high-level framework; parallel loops.
International Journal of Grid and Utility Computing, 2019 Vol.10 No.3, pp.201-211
Received: 13 Mar 2018
Accepted: 04 Aug 2018
Published online: 15 May 2019