Title: A novel bicriteria list scheduling with duplication for heterogeneous distributed systems

Authors: Weipeng Jing; Yaqiu Liu; Qu Wu

Addresses: Department of Computer Science and Technology, School of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China. ' Department of Computer Science and Technology, School of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China. ' Department of Computer Science and Technology, School of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China

Abstract: In order to achieve high performance in heterogeneous distributed systems, the efficient scheduling strategy is critical. Most of the heuristics for this NP-hard problem are based on a very simple system model of the target parallel system where communication contention is not taken into account. In this paper we propose a novel bicriteria list scheduling framework for the (schedule length and reliability) on communication contention model. We take the primary-backup replication technique of each individual task of the dependency task graph given as a specification with respect to the desired failure rate. To solve this bicriteria optimisation problem, we consider the processor and the communication link failure rate as a constraint, and we minimise the schedule length. We are thus able to use a Pareto curve of non-dominated solutions so we can choose the compromise that fits user requirements best. Another innovation is that we realise the case without offline planned schedules and discuss dynamic planning and adaptation of DAG tasks. Experimental results fully demonstrate the usefulness of the proposed algorithms, which lead to efficient execution schemes, while guaranteeing a prescribed level of fault-tolerance.

Keywords: heterogeneous systems; distributed systems; communication contention; primary-backup replication; bicriteria optimisation; bicriteria list scheduling; duplication; schedule length; schedule reliability; dependency task graph; failure rate; dynamic planning; efficient execution; fault tolerance.

DOI: 10.1504/IJMIC.2012.051083

International Journal of Modelling, Identification and Control, 2012 Vol.17 No.4, pp.315 - 325

Published online: 17 Dec 2014 *

Full-text access for editors Full-text access for subscribers Purchase this article Comment on this article