Title: A high performance redundancy scheme for cluster file systems

Authors: Manoj Pillai, Mario Lauria

Addresses: Department of Computer and Information Science, The Ohio State University, 2015 Neil Ave #395, Columbus, OH 43210, USA; Department of Computer and Information Science, The Ohio State University, 2015 Neil Ave #395, Columbus, OH 43210, USA

Abstract: A known issue in the design of striped file systems is their vulnerability to disk failures. In this paper, we address the challenges of augmenting an existing striped file system with traditional RAID redundancy, and propose a novel redundancy scheme designed to maximise disk throughput seen by applications. We implement our new scheme in CSAR, a proof-of-concept redundant file system based on the Parallel Virtual File System (PVFS), along with two other well-known schemes. Our tests using both microbenchmarks and representative scientific applications show that our scheme consistently performs as well as the best of the other two schemes. The application-dependent, potentially larger storage occupation of our scheme is justified by current technological trends that put I/O bandwidth at a premium over disk space.

Keywords: parallel I/O; fault tolerance; partial redundancy; distributed RAID; cluster file systems.

DOI: 10.1504/IJHPCN.2004.008895

International Journal of High Performance Computing and Networking, 2004 Vol.2 No.2/3/4, pp.90 - 98

Published online: 02 Feb 2006
