Multiscale Q-learning with linear function approximation

Publication Type : Journal Article

Publisher : Discrete Event Dynamic Systems

Source : Discrete Event Dynamic Systems, Volume 26, Number 3, p.477–509 (2016)

Url :

Campus : Bengaluru

School : School of Engineering

Department : Computer Science

Year : 2016

Abstract : We present in this article a two-timescale variant of Q-learning with linear function approximation. Both the Q-values and the policy are assumed to be parameterized, with the policy parameter updated on a faster timescale than the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that the proposed algorithm converges almost surely to a closed, connected, internally chain transitive invariant set of an associated differential inclusion.
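
The two-timescale idea in the abstract can be illustrated with a small sketch. This is not the algorithm from the article, only a minimal illustration under stated assumptions: the toy two-state MDP, the one-hot features, the softmax policy, the discount factor, the initial parameter values, and the step-size schedules `1/n**0.6` (fast) and `1/n` (slow) are all hypothetical choices made here. The only property taken from the abstract is that the policy parameter moves on a faster timescale than the Q-value parameter, which the sketch encodes by letting the ratio of slow to fast step sizes tend to zero.

```python
import math
import random

N_STATES, N_ACTIONS, GAMMA = 2, 2, 0.5  # toy sizes and discount (assumed)
DIM = N_STATES * N_ACTIONS


def phi(s, a):
    # One-hot features over (state, action) pairs: an assumed, simplest basis.
    v = [0.0] * DIM
    v[s * N_ACTIONS + a] = 1.0
    return v


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def policy(w, s):
    # Softmax policy whose preferences are linear in the policy parameter w.
    prefs = [dot(w, phi(s, a)) for a in range(N_ACTIONS)]
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def step(s, a):
    # Hypothetical deterministic MDP: the action becomes the next state,
    # action 1 pays reward 1 and action 0 pays reward 0.
    return a, float(a)


def run(n_iters=2000, seed=0):
    rng = random.Random(seed)
    theta = [1.0] * DIM  # Q-value parameter, slow timescale (arbitrary start)
    w = [0.0] * DIM      # policy parameter, fast timescale
    s = 0
    for n in range(1, n_iters + 1):
        fast = 1.0 / n ** 0.6  # larger step size for the policy parameter
        slow = 1.0 / n         # smaller step size; slow / fast -> 0
        a = rng.choices(range(N_ACTIONS), weights=policy(w, s))[0]
        s_next, r = step(s, a)

        # Slow update: temporal-difference step in which the max over actions
        # is replaced by the expected Q-value under the parameterized policy.
        q_next = sum(p * dot(theta, phi(s_next, b))
                     for b, p in enumerate(policy(w, s_next)))
        delta = r + GAMMA * q_next - dot(theta, phi(s, a))
        f = phi(s, a)
        theta = [t + slow * delta * x for t, x in zip(theta, f)]

        # Fast update: gradient ascent on the policy-averaged Q-value at s,
        # using grad_w sum_b pi(b|s) Q(s,b) = sum_b pi(b|s) (Q(s,b) - V(s)) phi(s,b).
        probs = policy(w, s)
        q = [dot(theta, phi(s, b)) for b in range(N_ACTIONS)]
        v = sum(p * qb for p, qb in zip(probs, q))
        for b in range(N_ACTIONS):
            fb = phi(s, b)
            w = [wi + fast * probs[b] * (q[b] - v) * x for wi, x in zip(w, fb)]
        s = s_next
    return theta, w
```

Calling `run()` returns the two parameter vectors; `policy(w, 0)` then gives the learned action probabilities in state 0, which in this toy problem lean toward the rewarded action.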

Cite this Research Publication : S. Bhatnagar and K., L., “Multiscale Q-learning with linear function approximation”, Discrete Event Dynamic Systems, vol. 26, no. 3, pp. 477–509, 2016.
