Title: PACC: a directive-based programming framework for out-of-core stencil computation on accelerators

Authors: Nobuhiro Miki; Fumihiko Ino; Kenichi Hagihara

Addresses: Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan; Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan; Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan

Abstract: We present the pipelined accelerator (PACC), a directive-based programming framework for accelerating large-scale stencil computation on an accelerator device such as a graphics processing unit (GPU). PACC provides a collection of extended OpenACC directives to facilitate out-of-core stencil computation accelerated using temporal blocking. The proposed framework includes a source-to-source translator that generates out-of-core OpenACC code from PACC code: large data is automatically decomposed into smaller chunks that are processed within the limited capacity of device memory. The generated code is optimised using a temporal blocking technique to minimise CPU-GPU data transfer. Furthermore, the code is accelerated using a multithreaded pipeline engine that maximises data copy throughput and overlaps GPU execution with data transfer. In experiments, we applied the proposed translator to three stencil computation codes. The out-of-core performance for 107 GB of data on an NVIDIA Tesla K40 GPU with 12 GB of memory reached 69.3 GFLOPS, which is 17% lower than the in-core performance for 8 GB of data. We believe that the proposed directive-based approach can be used to facilitate out-of-core stencil computation on a GPU.
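
The chunk decomposition and temporal blocking mentioned in the abstract can be illustrated with a minimal sketch. This is not PACC's generated OpenACC code: it is plain Python on a 1-D three-point stencil, and the names `step`, `in_core`, `out_of_core`, `chunk`, and `k` are hypothetical. List slices stand in for CPU-GPU copies, and the pipelined overlap of transfer and execution is not modelled. Each chunk is copied with a halo of `k` cells per side and advanced `k` time steps before being written back, so it is transferred once per `k` steps instead of once per step, at the cost of redundant computation in the halo.

```python
def step(u):
    """One sweep of a 3-point averaging stencil; the two endpoints are held fixed."""
    v = u[:]
    for i in range(1, len(u) - 1):
        v[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return v


def in_core(u, steps):
    """Reference: the whole array fits in (assumed) device memory."""
    for _ in range(steps):
        u = step(u)
    return u


def out_of_core(u, steps, chunk, k):
    """Chunked execution with temporal blocking: k time steps per chunk visit."""
    n = len(u)
    done = 0
    while done < steps:
        kk = min(k, steps - done)          # last block may be shorter
        new = u[:]
        for start in range(0, n, chunk):
            end = min(start + chunk, n)
            lo = max(0, start - kk)        # halo of kk cells on each side,
            hi = min(n, end + kk)          # clipped at the global boundary
            local = u[lo:hi]               # stands in for a host-to-device copy
            for _ in range(kk):
                local = step(local)        # stands in for kk kernel launches
            # Halo cells are stale after kk steps; keep only the chunk interior.
            new[start:end] = local[start - lo:end - lo]  # device-to-host copy
        u = new
        done += kk
    return u


if __name__ == "__main__":
    u0 = [float((i * 7) % 11) for i in range(50)]
    ref = in_core(u0, 10)
    got = out_of_core(u0, 10, chunk=8, k=4)
    print(max(abs(a - b) for a, b in zip(got, ref)))
```

The halo width must be at least the number of blocked time steps times the stencil radius (here 1); otherwise stale boundary values would reach the chunk interior before it is written back.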

Keywords: accelerator; directive-based programming; out-of-core execution; OpenACC; graphics processing unit; GPU.

DOI: 10.1504/IJHPCN.2019.097046

International Journal of High Performance Computing and Networking, 2019 Vol. 13 No. 1, pp. 19-34

Received: 17 Feb 2017
Accepted: 23 Apr 2017

Published online: 17 Dec 2018
