12 Advanced actor-critic methods

 

In this chapter

  • You will learn about more advanced deep reinforcement learning methods that remain, to this day, state-of-the-art algorithms.
  • You will learn about solving a variety of deep reinforcement learning problems, from problems with continuous action spaces to problems with high-dimensional action spaces.
  • You will build state-of-the-art actor-critic methods from scratch and open the door to understanding more advanced concepts related to artificial general intelligence.

Criticism may not be agreeable, but it is necessary. It fulfills the same function as pain in the human body. It calls attention to an unhealthy state of things.

— Winston Churchill, British politician, army officer, writer, and Prime Minister of the United Kingdom

In the last chapter, you learned about a different, more direct technique for solving deep reinforcement learning problems. You were first introduced to policy-gradient methods, in which agents learn policies by approximating them directly. In pure policy-gradient methods, we don’t use value functions as a proxy for finding policies; in fact, we don’t use value functions at all. Instead, we learn stochastic policies directly.

DDPG: Approximating a deterministic policy

 

DDPG uses many tricks from DQN

 
 
 

Learning a deterministic policy

 
 

Exploration with deterministic policies

 
 
 

TD3: State-of-the-art improvements over DDPG

 
 

Double learning in DDPG

 
 

Smoothing the targets used for policy updates

 
 
 
 

Delaying updates

 
 

SAC: Maximizing the expected return and entropy

 
 

Adding the entropy to the Bellman equations

 
 
 

Learning the action-value function

 
 
 
 

Learning the policy

 

Automatically tuning the entropy coefficient

 
 
 

PPO: Restricting optimization steps

 
 
 

Using the same actor-critic architecture as A2C

 
 
 
 

Batching experiences

 
 

Clipping the policy updates

 

Clipping the value function updates

 

Summary

 
 