Title: Graph-based model and algorithm for minimising big data movement in a cloud environment

Authors: Samadi Yassir; Mostapha Zbakh; Tadonki Claude

Addresses: National School of Computer Science and Systems Analysis, Mohamed V University, Rabat, Morocco; National School of Computer Science and Systems Analysis, Mohamed V University, Rabat, Morocco; Computer Research Center (CRI), Mines ParisTech-PSL, Paris, France

Abstract: In this paper, we discuss load balancing and data placement strategies in cloud environments. The main goal of a data placement strategy is to improve overall performance by reducing data movement among the participating datacentres. Load balancing and efficient data placement on cloud systems are critical problems that are difficult to address simultaneously. In this context, we propose a threshold-based load balancing algorithm that first balances the load between datacentres and then minimises the overhead of data exchanges. The algorithm is divided into three phases. First, the dependencies between the datasets are identified. Second, the load threshold of each datacentre is estimated based on its processing speed and storage capacity. Third, the load balancing between the datacentres is managed through the threshold parameters. Our experimental results show that our approach can efficiently reduce the frequency of data movement while maintaining a good load balance between the datacentres.
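
The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no formulas, so everything here is an assumption made for illustration. The dependency between two datasets is taken to be the number of tasks that use both, the threshold of a datacentre is taken to be a share of the total data size weighted by relative speed and relative capacity (with a hypothetical mixing parameter `alpha`), and placement is a simple greedy rule. All function and field names (`dependency_graph`, `load_thresholds`, `place`, `cross_centre_weight`, `speed`, `capacity`) are hypothetical.

```python
from itertools import combinations


def dependency_graph(tasks):
    """Phase 1 (assumed metric): edge weight = number of tasks using both datasets."""
    weights = {}
    for datasets in tasks.values():
        for a, b in combinations(sorted(set(datasets)), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def load_thresholds(centres, total_size, alpha=0.5):
    """Phase 2 (assumed formula): share of total data by relative speed and capacity."""
    total_speed = sum(c["speed"] for c in centres.values())
    total_cap = sum(c["capacity"] for c in centres.values())
    return {
        name: total_size * (alpha * c["speed"] / total_speed
                            + (1 - alpha) * c["capacity"] / total_cap)
        for name, c in centres.items()
    }


def place(sizes, weights, thresholds):
    """Phase 3 (assumed greedy rule): keep dependent datasets together under thresholds."""
    load = {dc: 0.0 for dc in thresholds}
    placement = {}

    def dep(a, b):
        return weights.get((min(a, b), max(a, b)), 0)

    # Largest datasets first; ties broken by name for determinism.
    for ds in sorted(sizes, key=lambda d: (-sizes[d], d)):
        def affinity(dc):
            # Total dependency with datasets already stored in this datacentre.
            return sum(dep(ds, other) for other, where in placement.items()
                       if where == dc)

        def headroom(dc):
            return thresholds[dc] - load[dc]

        fits = [dc for dc in sorted(thresholds)
                if load[dc] + sizes[ds] <= thresholds[dc]]
        if fits:
            best = max(fits, key=lambda dc: (affinity(dc), headroom(dc)))
        else:
            # No datacentre can take it within its threshold: pick the least loaded one.
            best = max(sorted(thresholds), key=headroom)
        placement[ds] = best
        load[best] += sizes[ds]
    return placement


def cross_centre_weight(placement, weights):
    """Total dependency weight between datasets stored in different datacentres."""
    return sum(w for (a, b), w in weights.items() if placement[a] != placement[b])


# Hypothetical example data.
tasks = {"t1": ["d1", "d2"], "t2": ["d1", "d2"], "t3": ["d3", "d4"], "t4": ["d2", "d3"]}
sizes = {"d1": 6, "d2": 6, "d3": 2, "d4": 2}
centres = {"A": {"speed": 3, "capacity": 300}, "B": {"speed": 1, "capacity": 100}}

weights = dependency_graph(tasks)
thresholds = load_thresholds(centres, total_size=sum(sizes.values()))
placement = place(sizes, weights, thresholds)
```

In this toy instance, the strongly dependent pair `d1` and `d2` ends up in the faster, larger datacentre and `d3` and `d4` in the smaller one, so only the single `d2`/`d3` dependency crosses datacentres. A lower `cross_centre_weight` stands in here for fewer data movements, which is again an assumption about how the paper measures movement.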

Keywords: graph model; big data; cloud computing; load balancing; data placement; data dependency; high performance computing.

DOI: 10.1504/IJHPCN.2019.102136

International Journal of High Performance Computing and Networking, 2019 Vol.14 No.3, pp.365 - 375

Received: 14 Mar 2017
Accepted: 24 Jan 2018

Published online: 03 Sep 2019
