Title: Off-loading application controlled data prefetching in numerical codes for multi-core processors

Authors: J. Weidendorfer, C. Trinitis

Addresses: Institut für Informatik, Technische Universität München, D-85747 Garching bei München, Germany; Institut für Informatik, Technische Universität München, D-85747 Garching bei München, Germany

Abstract: An important issue when designing numerical code in High Performance Computing is cache optimisation in order to exploit the performance potential of a given target architecture. This includes techniques to improve memory access locality as well as prefetching. Inherent algorithm constraints often limit the first approach, which typically uses a blocking technique. While automatic prefetching mechanisms exist in hardware and/or compilers, they cannot complement blocking with additional prefetching. We provide an infrastructure for off-loading application controlled prefetching on a chip multiprocessor, which allows further improvement of numerical code already optimised by standard cache optimisation. Clear benefits are shown for real workloads on existing hardware.

Keywords: chip multiprocessing; multi-core processors; cache optimisation; data prefetching; numerical codes; high performance computing.

DOI: 10.1504/IJCSE.2008.021109

International Journal of Computational Science and Engineering, 2008 Vol.4 No.1, pp.22 - 28

Published online: 04 Nov 2008
