Introduction to reinforcement learning

See the slides and the recorded video of the lecture on YouTube.

MDP framework, Terminology, Bellman equation
- Markov Decision Process
- Bellman equation
- Bellman optimality equation
- Infinite-horizon discounted problem
- Fixed-point iteration (contraction mapping)
Dynamic Programming
- Value iteration (VI)
- Policy iteration (PI)
  - Policy Evaluation
  - Policy Improvement
Approximate PI
- Bellman Error
- Tabular TD(0)-learning
Q-factor
- Q-learning as stochastic VI (off-policy)
- Optimistic PI for Q-factors: SARSA (on-policy)
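
As a rough illustration of three items in the outline above, the sketch below shows value iteration as fixed-point iteration of the Bellman optimality operator, together with the tabular Q-learning and SARSA updates. The two-state MDP, the dictionary layout of `P`, and all function names are assumptions made up for this example; they are not taken from the lecture.

```python
# Hypothetical 2-state, 2-action MDP, used only for illustration.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # discount factor (assumed value)


def bellman_optimality_backup(V, P, gamma):
    """Apply the operator (TV)(s) = max_a E[r + gamma * V(s')] once."""
    return {
        s: max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in actions.values()
        )
        for s, actions in P.items()
    }


def value_iteration(P, gamma, tol=1e-10, max_iters=10_000):
    """Iterate V <- TV until successive iterates differ by less than tol.

    For gamma < 1 the operator T is a contraction in the sup norm,
    so the iterates converge to its unique fixed point.
    """
    V = {s: 0.0 for s in P}
    for _ in range(max_iters):
        V_new = bellman_optimality_backup(V, P, gamma)
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new
    return V


def q_learning_update(Q, s, a, r, s_next, alpha, gamma):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy update: bootstrap from the action a_next actually taken."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


V_star = value_iteration(P, GAMMA)
```

The only difference between the two Q-factor updates is the bootstrap term: Q-learning takes a maximum over next actions (a sampled version of the value-iteration backup), while SARSA uses the next action chosen by the policy currently being followed.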

Comparison of reinforcement learning algorithms

Join the discussion on Google Groups!