Title: Comparing the performance of stochastic simulation on GPUs and OpenMP

Authors: Weijun Xiao; Peng Li; David J. Lilja

Addresses: Department of Electrical and Computer Engineering, University of Minnesota, Twin Cities, Minneapolis, MN, 55455, USA (all three authors)

Abstract: Because stochastic computing performs operations on streams of bits that represent probability values rather than deterministic values, it can tolerate a large number of failures in a noisy system. However, simulating a stochastic implementation is extremely time-consuming. In this paper, we investigate two approaches to speeding up stochastic simulation: a GPU-based simulation and an OpenMP-based simulation. To compare the two approaches, we start with several basic stochastic computing elements (SCEs) and then use the stochastic implementation of a frame difference-based image segmentation algorithm as a case study to conduct extensive experiments. Measured results show that the GPU-based simulation with 448 processing elements achieves a speedup of up to 119x over the single-threaded CPU simulation and up to 17x over the OpenMP-based simulation with eight processor cores. In addition, we present several performance optimisations for the GPU-based simulation that significantly improve the performance of stochastic simulation.
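
To make the bit-stream idea concrete, the following is a minimal Python sketch of one commonly described stochastic computing element: multiplication of two probabilities by a bitwise AND of their streams. It is an illustration only, not the simulator or the SCEs evaluated in the paper; the function names (`encode`, `decode`, `sc_multiply`, `flip_bits`), the stream length, and the 1% bit-flip rate are all assumptions chosen for the example.

```python
import random


def encode(p, length, rng):
    """Encode probability p as a bit stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode(stream):
    """Estimate the encoded value as the fraction of 1 bits in the stream."""
    return sum(stream) / len(stream)


def sc_multiply(a, b):
    """Multiply two independently generated streams with a bitwise AND."""
    return [x & y for x, y in zip(a, b)]


def flip_bits(stream, error_rate, rng):
    """Flip each bit independently with the given rate to mimic a noisy system."""
    return [bit ^ 1 if rng.random() < error_rate else bit for bit in stream]


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 100_000              # assumed stream length; longer streams reduce variance

a = encode(0.5, n, rng)
b = encode(0.4, n, rng)

clean = sc_multiply(a, b)
product = decode(clean)                       # close to 0.5 * 0.4 = 0.2
noisy = decode(flip_bits(clean, 0.01, rng))   # shifts only slightly under 1% flips

print(f"product estimate: {product:.4f}")
print(f"with 1% bit flips: {noisy:.4f}")
```

Every encoded value requires a long stream, and every stream element is processed bit by bit, which suggests why such simulations are slow when run serially. Because the per-bit operations are independent, they are also a natural fit for data-parallel execution.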

Keywords: parallel computing; GPU; graphics processing unit; stochastic simulation; image processing; fault tolerance; OpenMP; image segmentation.

DOI: 10.1504/IJCSE.2013.052111

International Journal of Computational Science and Engineering, 2013 Vol.8 No.1, pp.34 - 46

Received: 28 Jan 2012
Accepted: 19 Mar 2012

Published online: 27 Dec 2013
