Title: Impact of replica placement-based clustering on fault tolerance in grid computing

Authors: Rahma Souli-Jbali; Minyar Sassi Hidri; Rahma Ben Ayed

Addresses: National Engineering School of Tunis, University of Tunis El Manar, Tunisia; Deanship of Preparatory Year and Supporting Studies, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia; National Engineering School of Tunis, University of Tunis El Manar, Tunisia

Abstract: Data grids appear to be a good solution to the growing demand for very high computing power and storage capacity. However, designing distributed applications for data grids remains complex, and the dynamic nature of grids must be taken into account, since nodes may disappear at any time. We focus on the impact of replica placement-based clustering on fault tolerance in grids. Between clusters, a message-logging protocol is used. Within a cluster, the inter-cluster message-logging protocol is coupled with the non-blocking coordinated checkpointing of Chandy-Lamport. This ensures that, in case of failure, the impact of the fault remains confined to the nodes of the same cluster. The experimental results show the efficiency of the proposed protocol in terms of recovery time, the number of processes used, and the number of messages exchanged.

Keywords: data grids; fault tolerance; replica placement; clustering; job scheduling.

DOI: 10.1504/IJWET.2019.102873

International Journal of Web Engineering and Technology, 2019 Vol.14 No.2, pp.151 - 177

Published online: 03 Oct 2019
