Authors: Jingjin Li; Qingkui Chen; Bocheng Liu
Addresses: University of Shanghai for Science and Technology, 514 Military Road, Yangpu District, Shanghai, China; University of Shanghai for Science and Technology, 514 Military Road, Yangpu District, Shanghai, China; University of Shanghai for Science and Technology, 514 Military Road, Yangpu District, Shanghai, China
Abstract: Graphics processing units (GPUs) have grown in popularity and play an important role as coprocessors in heterogeneous co-processing environments. Heavily data-parallel problems can be solved efficiently because tens of thousands of threads work collaboratively in the parallel GPU architecture. The achieved performance therefore depends on how well multiple threads collaborate in parallel. In this paper, a static analytical kernel performance model (SAKP) is proposed to estimate the execution time of a GPU kernel. In particular, the proposed model generates a set of kernel and device features for the target GPU. With this model, we determine the performance-limiting factors and estimate the kernel execution time. Matrix multiplication (MM) and histogram generation (HG) were run on an NVIDIA GTX680 GPU card to verify the proposed model, and the absolute prediction error was less than 6.8%.
Keywords: graphics processing unit; GPU; co-processing; static analytical kernel performance model; SAKP; kernel and device features; absolute error.
International Journal of Computational Science and Engineering, 2019, Vol. 18, No. 2, pp. 201-210
Received: 29 Jan 2016
Accepted: 09 Aug 2016
Published online: 14 Feb 2019