12 Advanced actor-critic methods
In this chapter:
- You learn about more advanced deep reinforcement learning methods that are, to this day, state of the art in the field.
- You learn about solving a variety of deep reinforcement learning problems, from problems with continuous action spaces to problems with high-dimensional action spaces.
- You build state-of-the-art actor-critic methods from scratch and open the door to understanding more advanced concepts related to artificial general intelligence.
Criticism may not be agreeable, but it is necessary. It fulfills the same function as pain in the human body. It calls attention to an unhealthy state of things.
— Winston Churchill, British politician, army officer, writer, and Prime Minister of the United Kingdom
In the previous chapter, you learned about a different, more direct technique for solving deep reinforcement learning problems. You were first introduced to policy-gradient methods, in which agents learn policies by approximating them directly. Pure policy-gradient methods do not use value functions as a proxy for finding policies; in fact, they do not use value functions at all. Instead, they learn stochastic policies directly.
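To recall what "learning a stochastic policy directly, with no value function" looks like in code, here is a minimal REINFORCE-style sketch on a hypothetical two-armed bandit (a toy setup invented for illustration, not code from the book): the policy is just a softmax over a parameter vector, and the update ascends the reward-weighted log-probability gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
true_rewards = np.array([0.2, 0.8])  # hypothetical bandit: arm 1 pays more on average
theta = np.zeros(2)                  # policy parameters (logits); no value function anywhere
alpha = 0.1                          # learning rate

def policy(theta):
    """Stochastic policy: action probabilities via softmax over the logits."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

for _ in range(2000):
    probs = policy(theta)
    a = rng.choice(2, p=probs)            # sample an action from the stochastic policy
    r = rng.normal(true_rewards[a], 0.1)  # observe a noisy reward
    grad_log_pi = -probs                  # gradient of log pi(a|theta): one-hot(a) - pi
    grad_log_pi[a] += 1.0
    theta += alpha * r * grad_log_pi      # REINFORCE update: ascend r * grad log pi

print(policy(theta))  # probability mass concentrates on the better arm
```

Notice the contrast with value-based methods: there is no estimate of how good each arm is, only parameters of the policy itself, pushed up or down in proportion to the sampled reward.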