Title: Graph-based model and algorithm for minimising big data movement in a cloud environment
Authors: Samadi Yassir; Mostapha Zbakh; Tadonki Claude
Addresses: National School of Computer Science and Systems Analysis, Mohamed V University, Rabat, Morocco; National School of Computer Science and Systems Analysis, Mohamed V University, Rabat, Morocco; Computer Research Center (CRI), Mines ParisTech-PSL, Paris, France
Abstract: In this paper, we discuss load balancing and data placement strategies in cloud environments. The main goal of data placement strategies is to improve overall performance by reducing data movement among the participating datacentres. Load balancing and efficient data placement in cloud systems are critical problems that are difficult to address simultaneously. In this context, we propose a threshold-based load balancing algorithm, which first balances the load between datacentres and then minimises the overhead of data exchanges. The algorithm is divided into three phases. First, the dependencies between the datasets are identified. Second, the load threshold of each datacentre is estimated based on its processing speed and storage capacity. Third, load balancing between the datacentres is managed through the threshold parameters. Our experimental results show that our approach can efficiently reduce the frequency of data movement and maintain a good load balance between the datacentres.
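The three phases named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract gives no formulas, so the dependency measure (number of tasks that use both datasets), the threshold rule (a share of the total load proportional to speed times capacity), and the greedy placement are all assumptions made for illustration, as are the function names and the sample data.

```python
from itertools import combinations


def dependencies(tasks):
    """Phase 1 (assumed measure): edge weight = number of tasks using both datasets."""
    dep = {}
    for datasets in tasks.values():
        for a, b in combinations(sorted(set(datasets)), 2):
            dep[(a, b)] = dep.get((a, b), 0) + 1
    return dep


def thresholds(centres, total_load):
    """Phase 2 (assumed rule): split the total load in proportion to speed * capacity."""
    weights = {name: speed * cap for name, (speed, cap) in centres.items()}
    total = sum(weights.values())
    return {name: total_load * w / total for name, w in weights.items()}


def place(sizes, dep, limit):
    """Phase 3 (greedy heuristic): put each dataset in the datacentre that already
    holds its most strongly dependent datasets, without exceeding the threshold."""
    load = {c: 0.0 for c in limit}
    where = {}

    def affinity(d, c):
        # Total dependency weight between d and the datasets already placed in c.
        return sum(w for (a, b), w in dep.items()
                   if (a == d and where.get(b) == c) or (b == d and where.get(a) == c))

    # Largest datasets first, so the hardest items are placed while space remains.
    for d in sorted(sizes, key=sizes.get, reverse=True):
        fits = [c for c in limit if load[c] + sizes[d] <= limit[c]]
        candidates = fits or list(limit)  # if nothing fits, allow an overshoot
        # Prefer high affinity; break ties by the most remaining headroom.
        best = max(candidates, key=lambda c: (affinity(d, c), limit[c] - load[c]))
        where[d] = best
        load[best] += sizes[d]
    return where, load


# Hypothetical input: three tasks, four datasets, two datacentres (speed, capacity).
tasks = {"t1": ["d1", "d2"], "t2": ["d1", "d2"], "t3": ["d3", "d4"]}
sizes = {"d1": 3, "d2": 3, "d3": 2, "d4": 2}
centres = {"A": (2.0, 3.0), "B": (1.0, 4.0)}

dep = dependencies(tasks)
limit = thresholds(centres, sum(sizes.values()))
where, load = place(sizes, dep, limit)
print(where, load)
```

In this toy input the dependent pairs (d1, d2) and (d3, d4) each end up in a single datacentre and neither threshold is exceeded, so no task needs data from a remote datacentre. A greedy pass like this gives no optimality guarantee in general.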
Keywords: graph model; big data; cloud computing; load balancing; data placement; data dependency; high performance computing.
International Journal of High Performance Computing and Networking, 2019 Vol.14 No.3, pp.365 - 375
Received: 14 Mar 2017
Accepted: 24 Jan 2018
Published online: 09 Sep 2019