Title: Minimum mutable checkpoint-based coordinated checkpointing protocol for mobile distributed systems

Authors: Lalit K. Awasthi; Manoj Misra; R.C. Joshi

Addresses: Department of Computer Science and Engineering, National Institute of Technology, Hamirpur, Himachal Pradesh, PIN 177005, India; Department of Electronics and Computer Engineering, Indian Institute of Technology, Roorkee, PIN 247667, India; Graphics Era University, 566/6, Bell Road, Clement Town, Dehradun, Uttarakhand, 248002, India

Abstract: Coordinated checkpointing is an attractive approach for transparently adding fault tolerance to mobile distributed systems against various predictable and unexpected faults. This approach avoids the domino effect and requires minimum stable storage, but may require additional synchronisation messages and blocking. Coordinated checkpointing overhead can be minimised either by minimising the number of processes that checkpoint or by making the checkpointing process non-blocking. It has been reported in the literature that there cannot be a minimum-process, non-blocking coordinated checkpointing protocol. Minimisation of the processes that checkpoint for an initiation can nevertheless be combined with non-blocking by taking some additional checkpoints that may be discarded on completion of the checkpointing protocol (unnecessary checkpoints). Such checkpoints are called mutable if they are stored temporarily in the local memory of the mobile host. In this paper, we investigate this problem further and design an efficient coordinated checkpointing protocol that is non-blocking, requires coordination of only a minimum number of processes, and significantly reduces the overhead of unnecessary checkpoints. Simulation studies show that our protocol reduces the number of unnecessary checkpoints almost to zero.
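
The mutable-checkpoint idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol, whose details are not given in the abstract; it only shows the general notion of a checkpoint held in local memory that is either promoted to stable storage or discarded when an initiation completes. The names `MobileHost`, `take_mutable`, `commit`, and `finish_initiation` are hypothetical, and the minimum set of processes is assumed to be supplied from elsewhere.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class MobileHost:
    """Hypothetical process on a mobile host (illustrative names only)."""
    pid: int
    state: int = 0
    mutable: Optional[int] = None                     # checkpoint held in local memory
    stable: List[int] = field(default_factory=list)   # checkpoints on stable storage

    def take_mutable(self) -> None:
        # Save the state locally; nothing is sent to stable storage yet.
        self.mutable = self.state

    def commit(self, must_checkpoint: bool) -> str:
        """Resolve the mutable checkpoint when the initiation completes."""
        if self.mutable is None:
            return "none"
        if must_checkpoint:
            # The process belongs to the minimum set: keep the checkpoint.
            self.stable.append(self.mutable)
            outcome = "promoted"
        else:
            # The checkpoint turned out to be unnecessary: drop it.
            outcome = "discarded"
        self.mutable = None
        return outcome


def finish_initiation(hosts: Dict[int, MobileHost],
                      minimum_set: Set[int]) -> Dict[int, str]:
    # Assumption: minimum_set is computed elsewhere by the coordination logic.
    return {pid: host.commit(pid in minimum_set) for pid, host in hosts.items()}
```

In this sketch a discarded mutable checkpoint costs only local memory, which is why reducing their number matters less for stable storage than for the work done on the mobile host.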

Keywords: mobile systems; distributed systems; mutable checkpoints; fault tolerance; coordinated checkpointing; domino effect; consistent global state; simulation.

DOI: 10.1504/IJCNDS.2014.062226

International Journal of Communication Networks and Distributed Systems, 2014 Vol.12 No.4, pp.356 - 380

Published online: 21 Jun 2014
