Global convergence of Policy Gradient Methods in Markov Decision Process

Speaker: Ziniu Li

Title: Global convergence of Policy Gradient Methods in Markov Decision Process

Time: Apr 9 2pm-5pm

Short Abstract: We consider policy gradient methods in Markov decision processes. We show that, in general, the policy optimization problem is nonconvex, but a gradient domination property holds, which allows global convergence. Specifically, we present the convergence analysis for the projected gradient ascent algorithm.
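
For readers who want a concrete picture before the talk, here is a minimal sketch of projected gradient ascent on a small tabular MDP with the direct (per-state simplex) policy parameterization. It is an illustration under stated assumptions, not the speaker's or the paper's implementation: the random MDP, the function names (project_simplex, evaluate, projected_policy_gradient), and the default step size are all chosen here for illustration. The gradient used is dV(mu)/dpi(a|s) = d(s) Q(s,a) / (1 - gamma), where d is the discounted state visitation distribution started from mu. The default step size 1/beta with beta = 2 gamma |A| / (1 - gamma)^3 assumes rewards in [0, 1] and is meant to mirror the smoothness-based choice in the referenced analysis; treat it as an assumption.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def evaluate(pi, P, r, gamma, mu):
    """Return V^pi(mu), Q^pi (S x A), and the discounted visitation d (S,)."""
    S = r.shape[0]
    P_pi = np.einsum("sa,sat->st", pi, P)     # state-to-state kernel under pi
    r_pi = (pi * r).sum(axis=1)               # expected one-step reward under pi
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    Q = r + gamma * (P @ V)
    d = (1 - gamma) * np.linalg.solve(np.eye(S) - gamma * P_pi.T, mu)
    return mu @ V, Q, d

def projected_policy_gradient(P, r, gamma, mu, iters, eta=None):
    """Projected gradient ascent on V^pi(mu) over per-state simplices."""
    S, A = r.shape
    if eta is None:
        # Assumed step size 1/beta, beta = 2*gamma*|A|/(1-gamma)^3 (rewards in [0, 1]).
        eta = (1 - gamma) ** 3 / (2 * gamma * A)
    pi = np.full((S, A), 1.0 / A)             # start from the uniform policy
    values = []
    for _ in range(iters):
        v, Q, d = evaluate(pi, P, r, gamma, mu)
        values.append(v)
        grad = d[:, None] * Q / (1 - gamma)   # policy gradient, direct parameterization
        pi = np.apply_along_axis(project_simplex, 1, pi + eta * grad)
    values.append(evaluate(pi, P, r, gamma, mu)[0])
    return pi, np.array(values)

# Hypothetical toy MDP: 3 states, 2 actions, random kernel and rewards in [0, 1].
rng = np.random.default_rng(0)
S, A, gamma = 3, 2, 0.5
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)             # normalize each P[s, a, :] to sum to 1
r = rng.random((S, A))
mu = np.full(S, 1.0 / S)                      # uniform start distribution
pi, values = projected_policy_gradient(P, r, gamma, mu, iters=500)
print("initial value:", values[0], "final value:", values[-1])
```

With a step size of this form the recorded values should not decrease from one iteration to the next, even though the objective is nonconvex in the policy. How close the final value gets to the optimum within a given number of iterations depends on the MDP and on the start distribution mu.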

Reference:

“Optimality and approximation with policy gradient methods in Markov decision processes.” Annual Conference on Learning Theory, 2020.