Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method
Online publication date: Wed, 30-Apr-2008
by T. Zeiser, G. Wellein, A. Nitsure, K. Iglberger, U. Rude, G. Hager
Progress in Computational Fluid Dynamics, An International Journal (PCFD), Vol. 8, No. 1/2/3/4, 2008
Abstract: In this report we propose a parallel cache oblivious spatial and temporal blocking algorithm for the lattice Boltzmann method in three spatial dimensions. The underlying algorithm was originally proposed by Frigo et al. (1999); it divides the space-time domain of stencil-based methods in an optimal way, independently of any external parameters such as cache size. Given the widening gap between processor speed and memory performance, this approach offers a promising path to better cache utilisation. We find that even a straightforward cache oblivious implementation reduces memory traffic by at least a factor of two compared with a highly optimised standard kernel and improves scalability for shared memory parallelisation. Because of the recursive structure of the algorithm, we use an unconventional parallelisation scheme based on task queuing.
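To make the idea of recursive space-time blocking concrete, the following is a minimal serial sketch, not the implementation from the article. It assumes a one-dimensional three-point averaging stencil with fixed boundary cells in place of the three-dimensional lattice Boltzmann kernel, and it omits the task-queue parallelisation entirely. The recursion follows the trapezoid decomposition known from the cache oblivious stencil literature; the names `walk`, `run_blocked` and `run_naive` are hypothetical and chosen only for this illustration.

```python
def walk(u, t0, t1, x0, dx0, x1, dx1):
    """Update the space-time trapezoid covering time steps t0 <= t < t1.

    At time t the trapezoid spans cells x0 + dx0*(t - t0) up to (but not
    including) x1 + dx1*(t - t0). u holds two time levels, indexed by t % 2.
    """
    dt = t1 - t0
    if dt == 1:
        # Base case: one ordinary sweep over the cells of this time step.
        src, dst = u[t0 % 2], u[(t0 + 1) % 2]
        for x in range(x0, x1):
            dst[x] = (src[x - 1] + src[x] + src[x + 1]) / 3.0
    elif dt > 1:
        if 2 * (x1 - x0) + (dx1 - dx0) * dt >= 4 * dt:
            # Wide trapezoid: cut in space along a line of slope -1, so the
            # left part depends only on itself and can be processed first.
            xm = (2 * (x0 + x1) + (2 + dx0 + dx1) * dt) // 4
            walk(u, t0, t1, x0, dx0, xm, -1)
            walk(u, t0, t1, xm, -1, x1, dx1)
        else:
            # Tall trapezoid: cut in time, lower half before upper half.
            s = dt // 2
            walk(u, t0, t0 + s, x0, dx0, x1, dx1)
            walk(u, t0 + s, t1, x0 + dx0 * s, dx0, x1 + dx1 * s, dx1)


def run_blocked(initial, steps):
    """Advance `initial` by `steps` time steps using the recursive traversal."""
    u = [list(initial), list(initial)]  # boundary cells stay fixed in both levels
    walk(u, 0, steps, 1, 0, len(initial) - 1, 0)
    return u[steps % 2]


def run_naive(initial, steps):
    """Reference: one full sweep over all interior cells per time step."""
    cur = list(initial)
    for _ in range(steps):
        nxt = list(cur)
        for x in range(1, len(cur) - 1):
            nxt[x] = (cur[x - 1] + cur[x] + cur[x + 1]) / 3.0
        cur = nxt
    return cur
```

Both functions compute the same values; only the order in which space-time points are visited differs. Nothing in `walk` refers to a cache size: the recursion keeps halving the domain until the working set of a sub-trapezoid fits into whatever cache level is present, which is what "cache oblivious" refers to in the abstract.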