Title: Autonomic application management for large scale MPI programs

Authors: Aline P. Nascimento, Alexandre C. Sena, Jacques A. Silva, Daniela Q.C. Vianna, Cristina Boeres, Vinod E.F. Rebello

Addresses: Smart Grid Computing Laboratory, Instituto de Computacao, Universidade Federal Fluminense (UFF), Niteroi, RJ, Brazil (all authors)

Abstract: Computational grids aim to aggregate large numbers of resources to provide sufficient, low-cost computational power to a variety of applications. Writing applications capable of executing efficiently in grids is, however, extremely difficult. Grid resources are geographically distributed, typically heterogeneous and non-dedicated, and offer no performance or availability guarantees. This makes the collective management of resources and applications both complex and arduous. This work investigates an alternative approach, based on system-awareness, to the problem of developing grid applications and managing their execution efficiently. Results show that these system-aware applications are indeed faster than their conventional implementations and are easily grid-enabled.

Keywords: grid computing; MPI applications; autonomic applications; hybrid scheduling; fault-tolerant systems; message passing interface.

DOI: 10.1504/IJHPCN.2008.022299

International Journal of High Performance Computing and Networking, 2008, Vol. 5, No. 4, pp. 227-240

Published online: 27 Dec 2008
