Deep RL Bootcamp Lecture 4B: Policy Gradients Revisited