Multi-step learning and value-based approximation methods

See the slides and the recorded video of the lecture on YouTube.

$TD(\lambda)$
TD-learning with linear approximation of the value function
- $LSPE(\lambda)$
- $LSTD(\lambda)$
Convergence issues
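
As a rough illustration of the topics above, the following minimal sketch implements online $TD(\lambda)$ with a linear value function and the batch $LSTD(\lambda)$ estimator. It is not taken from the lecture material. The function names `td_lambda` and `lstd_lambda`, the use of accumulating traces, the step size, and the toy problem (a deterministic 3-state cycle with one-hot features) are all assumptions chosen for illustration.

```python
import numpy as np

def td_lambda(features, rewards, gamma, lam, alpha):
    """Online TD(lambda) with accumulating traces (assumed variant).

    features: array of shape (T + 1, d), feature vector of each visited state
    rewards:  array of shape (T,), reward received on each transition
    """
    theta = np.zeros(features.shape[1])
    z = np.zeros_like(theta)  # eligibility trace
    for t in range(len(rewards)):
        phi, phi_next = features[t], features[t + 1]
        # Temporal-difference error under the current linear estimate.
        delta = rewards[t] + gamma * phi_next @ theta - phi @ theta
        z = gamma * lam * z + phi
        theta = theta + alpha * delta * z
    return theta

def lstd_lambda(features, rewards, gamma, lam):
    """Batch LSTD(lambda): accumulate A and b, then solve A theta = b."""
    d = features.shape[1]
    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)
    for t in range(len(rewards)):
        phi, phi_next = features[t], features[t + 1]
        z = gamma * lam * z + phi
        A += np.outer(z, phi - gamma * phi_next)
        b += z * rewards[t]
    return np.linalg.solve(A, b)

# Hypothetical toy problem: deterministic cycle 0 -> 1 -> 2 -> 0.
gamma, lam = 0.9, 0.5
state_rewards = np.array([1.0, 0.0, 2.0])  # reward for leaving each state
T = 6000
states = np.arange(T + 1) % 3
rewards = state_rewards[states[:-1]]
features = np.eye(3)[states]  # one-hot features, so the linear fit can be exact

# Reference values from the Bellman equation (I - gamma P) v = r.
P = np.roll(np.eye(3), 1, axis=1)  # row i has its 1 in column (i + 1) % 3
v_true = np.linalg.solve(np.eye(3) - gamma * P, state_rewards)

theta_td = td_lambda(features, rewards, gamma, lam, alpha=0.1)
theta_lstd = lstd_lambda(features, rewards, gamma, lam)
```

With one-hot features and deterministic transitions, both estimates should land close to `v_true`. With stochastic transitions or features that cannot represent the value function exactly, the two methods only approximate it, and the result generally depends on $\lambda$.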