Title: PTRE: a probabilistic two-phase replication elimination policy in large-scale distributed storage platforms
Authors: Ning Han; Dongbo Liu
Addresses: Department of Networking Engineering, Hunan Institute of Engineering, Xiangtan 411104, China; College of Computer and Communication, Hunan Institute of Engineering, Xiangtan 411104, China
Abstract: To support large-scale data-intensive applications, massive distributed storage platforms are being widely deployed in more and more IT infrastructures. One of the most frequently discussed issues for distributed storage platforms is how to maintain desirable data availability without incurring too much extra cost. Data replication services play a key role in achieving this goal. Unfortunately, many existing replication policies are designed for small-scale or centralised storage platforms, and their performance tends to degrade dramatically when a system consists of thousands of autonomous storage nodes. In this paper, we present a novel replication policy that allows a storage platform to eliminate useless replicas while maintaining sufficient data availability. Through theoretical analysis, we prove that the cost of the proposed policy increases linearly with the number of underlying storage nodes, which means that it can be easily applied to large-scale distributed storage platforms. The experimental results indicate that the proposed replication scheme can significantly improve the effective utilisation of storage resources compared with other existing policies. In addition, it exhibits better robustness when the underlying storage platform is subject to dramatically fluctuating workloads.
Keywords: distributed storage; replication scheme; availability; probability theory.
International Journal of Networking and Virtual Organisations, 2019 Vol.20 No.4, pp.340 - 355
Received: 29 Mar 2017
Accepted: 04 Jun 2017
Published online: 28 Jun 2019