Authors: Fei Teng; Hao Yang; Tianrui Li; Frédéric Magoulès; Xiaoliang Fan
Addresses: School of Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031, China; School of Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031, China; School of Information Science and Technology, Southwest Jiaotong University, Chengdu, 610031, China; Ecole Centrale Paris, Grande Voie des Vignes, Chatenay-Malabry, 92295, France; School of Information Science and Engineering, Lanzhou University, Gansu, 730000, China
Abstract: Hadoop is a popular framework for processing growing volumes of data across clusters of computers, and has achieved great success in both industry and academic research. Although Hadoop has powerful batch processing capabilities, it cannot support real-time services, such as online payment or sensor data monitoring. These real-time services share strict deadlines: a response delivered after the deadline is considered useless. Current research on time-constrained scheduling algorithms generally aims at shortening the completion time, rather than guaranteeing a specific latency for real-time services. In this paper, we study the deadline-constrained scheduling problem on Hadoop, where service requests arrive randomly and no prior information is available. A maximum urgency scheduling (MUS) algorithm is proposed, and then implemented as a pluggable scheduler on Hadoop. This novel algorithm can be applied in heterogeneous environments with low computational complexity. Experiments indicate that the MUS algorithm maximises the number of jobs meeting their deadlines while maintaining fairness among different types of jobs.
Keywords: deadline constraints; maximum urgency scheduling; MapReduce; cloud computing; Hadoop; real-time services.
International Journal of Computational Science and Engineering, 2015 Vol.11 No.4, pp.360 - 367
Received: 21 Sep 2013
Accepted: 28 Oct 2013
Published online: 10 Dec 2015