Title: Research on multi-UAV task decision-making based on improved MADDPG algorithm and transfer learning

Authors: Bo Li; Shiyang Liang; Zhigang Gan; Daqing Chen; Peixin Gao

Addresses: School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China; School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China; School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China; School of Engineering, London South Bank University, London SE1 0AA, UK; School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China

Abstract: Intelligent algorithms for multi-UAV task decision-making currently suffer from major issues such as slow learning speed and poor generalisation capability. These issues make it difficult to obtain the expected learning results within a reasonable time and to apply a trained model in a new environment. To address these problems, this paper proposes PMADDPG, an improved algorithm based on multi-agent deep deterministic policy gradient (MADDPG). The algorithm adopts a two-layer experience pool structure to achieve prioritised experience replay. Experiences are first stored in the first-layer experience pool; those more conducive to training and learning are then selected according to priority criteria and placed in the second-layer experience pool. Experiences from the second-layer pool are then selected for model training with the PMADDPG algorithm. In addition, a model-based environment transfer learning method is designed to improve the generalisation capability of the algorithm. Comparative experiments show that, compared with the MADDPG algorithm, the proposed algorithms significantly improve learning speed, task success rate and generalisation capability.
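
The two-layer experience pool described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `TwoLayerReplayPool`, its methods, the pool capacities and the use of a single scalar priority per experience (for instance an absolute TD error) are all assumptions made for illustration, since the abstract does not specify the priority criteria.

```python
import random
from collections import deque


class TwoLayerReplayPool:
    """Hypothetical two-layer experience pool (illustrative names only)."""

    def __init__(self, first_capacity=10000, second_capacity=1000, seed=None):
        # Layer 1 holds every experience together with its priority.
        self.first = deque(maxlen=first_capacity)
        # Layer 2 holds only the experiences selected by priority.
        self.second = deque(maxlen=second_capacity)
        self.rng = random.Random(seed)

    def store(self, transition, priority):
        """Store an experience in the first layer.

        `priority` is an assumed scalar score; the abstract does not
        say how priorities are computed.
        """
        self.first.append((priority, transition))

    def promote(self, k):
        """Copy the k highest-priority experiences from layer 1 to layer 2.

        Experiences stay in layer 1, so repeated calls may copy the
        same experience again.
        """
        top = sorted(self.first, key=lambda item: item[0], reverse=True)[:k]
        self.second.extend(transition for _, transition in top)
        return len(top)

    def sample(self, batch_size):
        """Draw a uniform random training batch from the second layer."""
        n = min(batch_size, len(self.second))
        return self.rng.sample(list(self.second), n)


# Usage: store five experiences, promote the best three, sample two.
pool = TwoLayerReplayPool(first_capacity=5, second_capacity=3, seed=0)
for i, p in enumerate([0.1, 0.9, 0.5, 0.7, 0.2]):
    pool.store("s%d" % i, p)
promoted = pool.promote(3)
batch = pool.sample(2)
```

In this sketch the second layer is sampled uniformly. Whether PMADDPG samples the second layer uniformly or by priority is not stated in the abstract, so that choice is also an assumption.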

Keywords: multi-UAV task decision; improved MADDPG algorithm; two-layer experience pool; transfer learning.

DOI: 10.1504/IJBIC.2021.118087

International Journal of Bio-Inspired Computation, 2021 Vol.18 No.2, pp.82 - 91

Received: 20 Aug 2020
Accepted: 20 Nov 2020

Published online: 12 Oct 2021