RL Seminar

Online Target Q-Learning with Reverse Experience Replay - Efficiently Finding The Optimal Policy for Linear MDPs

Ziniu Li (CUHK-Shenzhen)