Title: Convergence and numerical stability of action-dependent heuristic dynamic programming algorithms based on RLS learning for online DLQR optimal control
Authors: Guilherme Bonfim De Sousa; Patrícia Helena Moraes Rêgo
Addresses: Master Program in Computer Engineering and Systems, State University of Maranhão, São Luís, MA, Brazil; Mathematics and Computing Department, State University of Maranhão, São Luís, MA, Brazil
Abstract: This paper develops a novel approximate dynamic programming (ADP) algorithm based on recursive least-squares (RLS) learning for approximating the optimal control solution online, in real time, and analyses its numerical stability. ADP was developed to make dynamic programming techniques usable in real time, but it carries considerable computational complexity owing to the size of the algorithm's internal matrices and the need to invert some of them. To improve the numerical stability and reduce the computational cost of ADP algorithms, specifically in the context of action-dependent heuristic dynamic programming and optimal control, UDU^T-type unitary transformations are integrated into actor-critic architectures, yielding algorithms better suited to implementation in real-world optimal control systems. The control and stabilisation of an inverted pendulum on a motor-driven cart serves as the study platform for evaluating the convergence and numerical stability of the parameters estimated by the proposed algorithm.
Keywords: action-dependent heuristic dynamic programming; discrete linear quadratic regulator; recursive least-squares; numerical stability.
DOI: 10.1504/IJCSE.2019.103962
International Journal of Computational Science and Engineering, 2019, Vol. 20, No. 3, pp. 317-334
Received: 23 Dec 2017
Accepted: 16 Dec 2018
Published online: 03 Dec 2019
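
Illustrative note: the abstract refers to RLS learning in which the covariance matrix is carried in UDU^T-factored form. The record gives no algorithmic details, so the following is only a minimal sketch of a generic Bierman-style UD update for RLS with a forgetting factor, and it should not be read as the authors' algorithm. The function name `ud_rls_update`, the forgetting factor `lam`, the initial values and the toy regression data are all assumptions made for illustration. The covariance P = U D U^T is never formed explicitly; only the unit upper-triangular factor U and the diagonal d are updated.

```python
def ud_rls_update(theta, U, d, phi, y, lam=1.0):
    """One RLS step with P = U diag(d) U^T kept in factored form.

    Hypothetical helper (not from the paper). theta: parameter estimate,
    U: unit upper-triangular factor (list of lists), d: diagonal of D,
    phi: regressor, y: scalar target, lam: forgetting factor (0 < lam <= 1).
    theta, U and d are updated in place and also returned.
    """
    n = len(theta)
    # f = U^T phi and v = D f
    f = [sum(U[i][j] * phi[i] for i in range(j + 1)) for j in range(n)]
    v = [d[j] * f[j] for j in range(n)]
    alpha = lam
    for j in range(n):
        alpha_prev = alpha
        alpha = alpha_prev + f[j] * v[j]
        # Diagonal update; the division by lam applies the forgetting factor
        d[j] = d[j] * alpha_prev / (alpha * lam)
        p = -f[j] / alpha_prev
        for i in range(j):
            u_old = U[i][j]
            U[i][j] = u_old + v[i] * p
            v[i] += u_old * v[j]  # accumulate the unnormalised gain
    # Gain is v / alpha; correct theta with the a priori prediction error
    err = y - sum(phi[i] * theta[i] for i in range(n))
    for i in range(n):
        theta[i] += v[i] / alpha * err
    return theta, U, d


# Toy usage (assumed data): identify y = 2*x1 - 3*x2 from noise-free samples.
theta = [0.0, 0.0]
U = [[1.0, 0.0], [0.0, 1.0]]
d = [1e6, 1e6]  # large initial covariance, an assumption
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -3.0),
           ([1.0, 1.0], -1.0), ([2.0, 1.0], 1.0)]
for phi, y in samples:
    ud_rls_update(theta, U, d, phi, y)
```

Because each entry of d is only ever multiplied by a positive ratio, the factored covariance stays positive definite by construction, which is the usual motivation for UD-type updates over the conventional covariance recursion.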