Title: A dependability layer for large-scale distributed systems

Authors: Valentin Cristea, C. Dobre, F. Pop, C. Stratan, A. Costan, C. Leordeanu, E. Tirsa

Addresses: Department of Computer Science, University Politehnica of Bucharest, Spl. Independentei 313, Bucharest, Romania (all authors)

Abstract: Ensuring dependability in large-scale distributed systems is an important research subject today. Although many projects have obtained valuable results in this domain, no acceptable solution has yet been found that integrates all the requirements for designing a dependable system and exploits all the capabilities of modern systems. We present a unified, aggregate approach to ensuring the reliability, availability, safety and security of distributed systems. Starting from the proposed architecture, we present implementation details for two solutions designed to ensure fault tolerance, using virtualisation and container-based replication of services. We also present an approach to enhancing security in large-scale distributed systems by combining modern security models. The results and implementation details can serve as a methodology to help distributed infrastructures adopt such a middleware layer for enforcing dependability.

Keywords: dependability; large-scale systems; distributed systems; virtualisation; security; fault tolerance; reliability; availability; safety; container-based replication; middleware layers; grid computing; distributed computing.

DOI: 10.1504/IJGUC.2011.040598

International Journal of Grid and Utility Computing, 2011, Vol. 2, No. 2, pp. 109-118

Published online: 28 Mar 2015
