Title: Network traffic driven storage repair

Authors: Danilo Gligoroski; Katina Kralevska; Rune E. Jensen; Per Simonsen

Addresses: Department of Information Security and Communication Technology, NTNU, Norwegian University of Science and Technology, Trondheim, Norway; Department of Information Security and Communication Technology, NTNU, Norwegian University of Science and Technology, Trondheim, Norway; Department of Computer Science, NTNU, Norwegian University of Science and Technology, Trondheim, Norway; MemoScale AS, Kongens Gate 30, Trondheim 7012, Norway

Abstract: Recently, we constructed an explicit family of locally repairable and locally regenerating codes. Their existence was proven by Kamath et al., but no explicit construction was given. Our design is based on HashTag codes, which can have different sub-packetisation levels. In this work, we emphasise the importance of having two ways to repair a node: repair using only local parity nodes, or repair using both local and global parity nodes. We say that the repair strategy is network traffic driven because it depends on the concrete system and code parameters: the repair bandwidth of the code, the number of I/O operations, the access time for the contacted parts and the size of the stored file. We show the benefits of this repair duality in a practical example implemented in Hadoop. We also give algorithms for efficient repair of the global parity nodes.

Keywords: vector codes; repair bandwidth; repair locality; exact repair; parity-splitting; global parities; Hadoop.

DOI: 10.1504/IJBDI.2019.100888

International Journal of Big Data Intelligence, 2019 Vol.6 No.3/4, pp.212 - 223

Received: 08 Mar 2018
Accepted: 27 Jun 2018

Published online: 04 Jun 2019
