Authors: P.J. Kumar; P. Ilango
Addresses: School of Information Technology and Engineering, VIT University, Vellore, Tamil Nadu, India; School of Computing Sciences and Engineering, VIT University, Vellore, Tamil Nadu, India
Abstract: The Hadoop distributed file system (HDFS) replicates data to keep it available after failures such as a data node crash, disk failure, switch or rack failure, or corruption of a data block. The growth of big data has led to large volumes of data being stored and managed in cloud clusters. Availability increases with the degree of replication, but so do the replication cost and the update cost of data blocks in the cloud. Applications running on the data nodes impose various QoS requirements when a block of their data is replicated, such as disk access latency, constant bandwidth, delay and jitter. Existing replication algorithms replicate data based on the replication factor and the QoS requirements specified by the application. At any given time, the types of replication requests issued by different applications can be expected to vary widely, so replicas need to be allocated based on both the request type and the replication factor in order to balance replication cost against data availability within the block space available across the entire cluster. We propose a multi attribute QoS replica allocation algorithm that replicates data by considering the different types of replica request, the replication factor and the total available space to achieve a balanced replica allocation. The proposed algorithm satisfies the different QoS needs of applications and reduces the number of QoS-violated replicas when the requests comprise different QoS types. We measure the performance of the proposed algorithm in allocating replicas and the reduction in the number of QoS-violated replicas relative to existing algorithms such as random replication. The simulation results show better performance than the existing algorithms, at the cost of a slight increase in computational time.
Keywords: Hadoop distributed file system; HDFS; replication; multi attribute QoS aware replica allocation; QoS violation.
International Journal of Internet Technology and Secured Transactions, 2018 Vol.8 No.2, pp.195 - 208
Received: 30 Mar 2017
Accepted: 17 May 2017
Published online: 25 Jul 2018
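
Illustrative sketch: the abstract describes allocating replicas by request type, replication factor and available block space, and counting QoS-violated replicas. The Python below is a minimal greedy sketch of that idea only; it is not the authors' algorithm, which the abstract does not specify. The names Node, Request, satisfies and allocate, the choice of two QoS attributes (latency and bandwidth), and the "strictest request first" ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical data node model; fields are assumptions, not from the paper.
    name: str
    free_blocks: int
    latency_ms: float       # disk access latency the node can offer
    bandwidth_mbps: float   # bandwidth the node can offer


@dataclass
class Request:
    # Hypothetical replica request carrying two QoS bounds.
    block_id: str
    replication_factor: int
    max_latency_ms: float       # latency must not exceed this
    min_bandwidth_mbps: float   # bandwidth must be at least this


def satisfies(node, req):
    """True if the node meets every QoS bound of the request."""
    return (node.latency_ms <= req.max_latency_ms
            and node.bandwidth_mbps >= req.min_bandwidth_mbps)


def allocate(nodes, requests):
    """Greedy placement: returns (placement, number of QoS-violated replicas)."""
    placement = {}
    violations = 0
    # Assumed heuristic: serve the requests with the fewest qualifying nodes first.
    ordered = sorted(requests,
                     key=lambda r: sum(satisfies(n, r) for n in nodes))
    for req in ordered:
        candidates = [n for n in nodes if n.free_blocks > 0]
        # Qualifying nodes come first; ties go to the node with more free space.
        candidates.sort(key=lambda n: (not satisfies(n, req), -n.free_blocks))
        chosen = candidates[:req.replication_factor]  # distinct nodes per block
        for node in chosen:
            node.free_blocks -= 1
            if not satisfies(node, req):
                violations += 1  # replica placed, but a QoS bound is violated
        placement[req.block_id] = [n.name for n in chosen]
    return placement, violations
```

Calling allocate with a list of Node objects and a list of Request objects returns the chosen node names per block together with the violation count, which is the quantity the abstract compares against random replication. If fewer nodes with free space remain than the replication factor, this sketch simply places fewer replicas.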