Int. J. of Grid and Utility Computing   »   2011 Vol.2, No.4

 

 

Title: Operating two InfiniBand grid clusters over 28 km distance

 

Authors: Sabine Richling, Steffen Hau, Heinz Kredel, Hans-Günther Kruse

 

Addresses:
IT Center, University of Heidelberg, 69120 Heidelberg, Germany.
IT Center, University of Mannheim, 68131 Mannheim, Germany.
IT Center, University of Mannheim, 68131 Mannheim, Germany.
IT Center, University of Mannheim, 68131 Mannheim, Germany.

 

Abstract: This paper considers an InfiniBand connection between two bwGRiD clusters over a distance of 28 km in day-to-day production use. We discuss the hardware set-up in which InfiniBand messages are converted and transported over a fibre optic connection. The two clusters can be operated as a single system image; the batch system enforces that all nodes for a job are allocated on one side of the cluster. This is to optimise MPI performance, which would not be sufficient for communication between nodes on opposite sides of the 28 km connection. We report on the successful solution of all technical and organisational integration hurdles. Using a simple performance model, we discuss the anticipated costs of doubling the communication performance.

 

Keywords: operating clusters; long-distance InfiniBand; performance modelling; grid utility; grid computing; grid clusters; fibre optics; communication performance.

 

DOI: 10.1504/IJGUC.2011.042946

 

Int. J. of Grid and Utility Computing, 2011 Vol.2, No.4, pp.303 - 312

 

Submission date: 30 Dec 2010
Date of acceptance: 15 May 2011
Available online: 08 Oct 2011

 

 
