Title: Design of the notification system for failure detectors

Authors: Naohiro Hayashibara, Makoto Takizawa

Addresses: Faculty of Computer Science and Engineering, Department of Computer Science, Kyoto Sangyo University, Japan; Faculty of Science and Technology, Department of Computers and Information Science, Seikei University, Japan

Abstract: It is widely recognised that distributed systems would greatly benefit from a generic failure detection service. This paper addresses the construction of a monitoring network of failure detectors. We propose an algorithm to construct and manage a monitoring network in which each failure detector is monitored by some other failure detectors, and notifications of failures are propagated along the network. In particular, the network can include various types of failure detectors, from simple timeout-based failure detectors to accrual failure detectors, and it helps to spread information on suspected processes and nodes. We also simulate the proposed construction algorithm; the results show that it scales as the number of failure detectors increases.
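
To make the idea of a monitoring network concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes a ring layout in which each detector is monitored by its k successors, and it assumes a notification is forwarded from a detector to the detectors that monitor it. The names `build_monitoring_network` and `propagate_suspicion`, and the parameters `n` and `k`, are hypothetical.

```python
from collections import deque


def build_monitoring_network(n, k):
    """Map each detector to the set of detectors that monitor it.

    Assumption for illustration: detectors 0..n-1 sit on a ring and
    detector i is monitored by its k successors.
    """
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    return {i: {(i + j) % n for j in range(1, k + 1)} for i in range(n)}


def propagate_suspicion(monitors, suspected):
    """Return {detector: hop count} for a suspicion of `suspected`.

    The monitors of the suspected detector learn of it first (hop 1).
    Each informed detector then forwards the notification to its own
    monitors (breadth-first), skipping the suspected detector itself.
    """
    hops = {m: 1 for m in monitors[suspected]}
    queue = deque(hops)
    while queue:
        node = queue.popleft()
        for nxt in monitors[node]:
            if nxt != suspected and nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


network = build_monitoring_network(8, 2)
hops = propagate_suspicion(network, suspected=0)
print(hops)
```

In this toy layout, raising k shortens the propagation path at the cost of more monitoring links per detector; how the paper's algorithm balances this is not stated in the abstract.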

Keywords: failure detection; accrual failure detectors; overlay network; topology; self-organising algorithms; distributed systems; monitoring networks; simulation.

DOI: 10.1504/IJHPCN.2009.026289

International Journal of High Performance Computing and Networking, 2009 Vol.6 No.1, pp.25 - 34

Published online: 05 Jun 2009
