Title: Implementation matters of continuous-time deep reinforcement learning and its application in robotic arms
Authors: Jin-Qiang Wang; Yuanbo Jiang; Lirong Song; Dongyi Cai; Rui Zhou; Qingguo Zhou; Changyan Di
Addresses: School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China (all authors)
Abstract: Continuous-time reinforcement learning (CTRL) and discrete-time reinforcement learning (DTRL) have both been applied successfully to a range of tasks. CTRL, however, has received less attention, and existing research concentrates on algorithmic enhancements while giving limited consideration to model parameters. This neglect hampers reproducibility and performance tuning. In this paper, we conduct a large-scale experimental analysis of hyperparameter tuning in CTRL based on the Hamilton-Jacobi DQN (HJDQN) algorithm. We aim to improve CTRL practice by reducing the performance variation caused by irreproducibility and misunderstood settings, and by cutting wasted computational resources. Experimental results on four MuJoCo tasks show that larger γ values yield better performance, with optimal average rewards achieved at sampling intervals h of 0.2, 2.0 and 4.0, respectively. Additionally, experiments in the PandaGym environment show that the effect of the sampling interval on performance is consistent with its effect on the Q-value function, and that smaller Lipschitz constraint constants facilitate agent learning.
Keywords: deep reinforcement learning; continuous-time; hyperparameter tuning; HJDQN.
DOI: 10.1504/IJWGS.2025.150182
International Journal of Web and Grid Services, 2025 Vol.21 No.3/4, pp.384 - 397
Received: 21 Oct 2024
Accepted: 22 Sep 2025
Published online: 02 Dec 2025