Authors: Ghalem Belalem; Said Limam
Addresses: Department of Computer Science, Faculty of Sciences, University of Oran, B.P. 1524, EL M'Naouer, Oran, 31000, Algeria; Department of Computer Science, Faculty of Sciences, University of Oran, B.P. 1524, EL M'Naouer, Oran, 31000, Algeria
Abstract: Reliability is the probability that a system will offer failure-free service for a specified period of time within the bounds of a specified environment. For the cloud, reliability is broadly a function of the reliability of four individual components: 1) the hardware and software facilities offered by providers; 2) the provider's personnel; 3) connectivity to the subscribed services; 4) the subscriber's personnel. Providing redundant alternatives for every cloud component is too expensive. To reduce cost and to build a highly reliable cloud within a limited budget, we propose in this paper a fault-tolerant architecture for cloud computing that uses a dynamic and adaptive checkpoint mechanism.
Keywords: fault tolerance; cloud computing; virtualisation; checkpointing; resource management; reliability.
International Journal of Communication Networks and Distributed Systems, 2013 Vol.11 No.3, pp.236 - 249
Received: 08 May 2021
Accepted: 12 May 2021
Published online: 13 Aug 2013