Title: Implementation matters of continuous-time deep reinforcement learning and its application in robotic arms

Authors: Jin-Qiang Wang; Yuanbo Jiang; Lirong Song; Dongyi Cai; Rui Zhou; Qingguo Zhou; Changyan Di

Addresses: School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China (all authors)

Abstract: Continuous-time reinforcement learning (CTRL) and discrete-time reinforcement learning (DTRL) have been successfully applied to various tasks. However, CTRL has garnered less attention, with research primarily targeting algorithmic enhancements and giving limited consideration to model parameters. This neglect of model parameters hampers reproducibility and performance tuning. In this paper, we conduct a large-scale experimental analysis of hyperparameter tuning in CTRL based on the Hamilton-Jacobi DQN (HJDQN) algorithm. We aim to improve CTRL by minimising performance variations from irreproducibility and misunderstandings, while reducing waste of computational resources. Experimental results on four MuJoCo tasks reveal that larger γ values yield better performance, with optimal average rewards achieved at sampling intervals h of 0.2, 2.0 and 4.0, respectively. Additionally, experiments in the PandaGym environment demonstrate that the sampling interval's impact on performance aligns with its effect on the Q-value function, and smaller Lipschitz constraint constants facilitate agent learning.

Keywords: deep reinforcement learning; continuous-time; hyperparameter tuning; HJDQN.

DOI: 10.1504/IJWGS.2025.150182

International Journal of Web and Grid Services, 2025 Vol.21 No.3/4, pp.384 - 397

Received: 21 Oct 2024
Accepted: 22 Sep 2025
Published online: 02 Dec 2025