Automation of noise sampling in deep reinforcement learning
by Kunal Karda; Namit Dubey; Abhas Kanungo; Varun Gupta
International Journal of Applied Pattern Recognition (IJAPR), Vol. 7, No. 1, 2022

Abstract: Actor-critic models are generally prone to overestimated Q-values and sub-optimal policies. The proposed approach builds on a deep reinforcement learning algorithm, the twin delayed deep deterministic policy gradient algorithm (TD3). It is applied to complex reinforcement learning problems in which a half-humanoid robot, an ant, and a half-cheetah must learn to cover a path. Such problems can only be solved by an algorithm that works on continuous action spaces without much delay in propagating the result during model inference. The proposed model has been adapted to converge faster to optimal Q-values. TD3 uses two deep neural networks to learn two Q-values, Q1 and Q2; in the proposed approach the average of these Q-values is taken as the input for the final Q-value, unlike other reinforcement learning algorithms such as DDPG, which is prone to overestimating Q-values. The proposed approach also introduces a self-adjusting noise clipping function, which makes it harder for the policy to exploit Q-function errors and thereby further improves performance.
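
The two mechanisms named in the abstract, combining the twin Q-values and clipping the noise added to the target action, can be illustrated with a minimal scalar sketch. It is not the authors' implementation: the function names, the hyperparameter values (sigma, noise_clip, gamma, max_action), and the scalar action are assumptions made for illustration. The abstract does not specify how the noise clip self-adjusts, so the clip is left as a fixed parameter here. The "min" option shows the clipped double-Q rule of standard TD3 for comparison with the averaging described above.

```python
import random


def smoothed_target_action(mu_next, sigma=0.2, noise_clip=0.5,
                           max_action=1.0, rng=random):
    """Target policy smoothing: perturb the target action with clipped noise.

    mu_next is the (scalar) action proposed by the target actor.
    sigma, noise_clip and max_action are assumed values, not from the paper.
    """
    noise = rng.gauss(0.0, sigma)
    # Clip the noise so the perturbed action stays close to mu_next.
    noise = max(-noise_clip, min(noise_clip, noise))
    # Keep the perturbed action inside the valid action range.
    return max(-max_action, min(max_action, mu_next + noise))


def td_target(reward, done, q1_next, q2_next, gamma=0.99, combine="mean"):
    """Bootstrapped target y = r + gamma * (1 - done) * Q_combined."""
    if combine == "mean":
        # Average of the two critics, as the abstract describes.
        q_next = 0.5 * (q1_next + q2_next)
    elif combine == "min":
        # Clipped double-Q rule used by standard TD3.
        q_next = min(q1_next, q2_next)
    else:
        raise ValueError("combine must be 'mean' or 'min'")
    return reward + gamma * (1.0 - float(done)) * q_next


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(smoothed_target_action(0.3, rng=rng))
    print(td_target(1.0, False, 10.0, 12.0, combine="mean"))
    print(td_target(1.0, False, 10.0, 12.0, combine="min"))
```

With critic estimates of 10 and 12, the averaged target is larger than the minimum-based target, so the choice of combination rule directly changes how conservative the value estimate is.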

Online publication date: Thu, 14-Apr-2022
