8 Curiosity-Driven Exploration

In this chapter

  • Understand the sparse reward problem
  • Understand how curiosity can serve as an intrinsic reward
  • Play Super Mario Bros. from OpenAI Gym (see the environment sketch after this list)
  • Implement an intrinsic curiosity module in PyTorch
  • Train a deep Q-network agent to play Super Mario Bros. using just curiosity
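
To make the third bullet concrete, here is a minimal sketch of opening the Super Mario Bros. environment and taking random actions. It assumes the gym-super-mario-bros package (with its nes-py dependency) and the older four-value Gym step API; the environment ID and action set shown are the package's standard options, not anything specific to this chapter's agent.

import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import COMPLEX_MOVEMENT

# Restrict the raw NES controller inputs to a small discrete set
# of button combinations.
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, COMPLEX_MOVEMENT)

state = env.reset()                      # RGB frame as a NumPy array
for _ in range(100):
    action = env.action_space.sample()   # random button combination
    state, reward, done, info = env.step(action)
    if done:
        state = env.reset()
env.close()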

Code for this chapter is in this book’s GitHub repository under Chapter 8:

The fundamental reinforcement learning algorithms we have studied so far, such as Deep Q-learning and policy gradient methods, are very powerful techniques in many situations but fail dramatically in others. Google DeepMind pioneered the field of deep reinforcement learning back in 2013 when they used Deep Q-learning to train an agent to play multiple Atari games at superhuman performance levels, yet the agent’s performance was highly variable across different types of games. At one extreme, their DQN agent played the Atari game Breakout vastly better than a human; at the other, it was much worse than a human at Montezuma’s Revenge, where it could not even pass the first level. The difference lies in how often the environment hands out rewards: in Breakout rewards arrive almost continuously as bricks are destroyed, whereas in Montezuma’s Revenge the agent must complete a long sequence of actions, such as retrieving a key, before it receives any reward at all, so random exploration almost never produces a learning signal.

Figure 8.1: Screenshot from the Montezuma’s Revenge Atari game. The player must navigate through obstacles to get a key before any rewards are received.
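
This chapter’s remedy is to add an intrinsic “curiosity” reward on top of the sparse extrinsic one: the agent tries to predict what happens next, and its prediction error itself becomes a reward, drawing it toward states it does not yet understand. Here is a rough sketch of that idea in PyTorch; it is not the full intrinsic curiosity module built in section 8.6, and the ForwardModel class, its layer sizes, and the placeholder tensors are all illustrative assumptions.

import torch
import torch.nn as nn

# Illustrative forward model: predicts an encoding of the next state
# from the current state encoding and the action taken.
class ForwardModel(nn.Module):
    def __init__(self, state_dim=32, action_dim=12):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64),
            nn.ReLU(),
            nn.Linear(64, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

model = ForwardModel()
state = torch.randn(1, 32)         # placeholder encoded current state
action = torch.zeros(1, 12)        # placeholder one-hot action
action[0, 3] = 1.0
next_state = torch.randn(1, 32)    # placeholder encoded next state

# Curiosity: the worse the prediction, the more "surprising" the
# transition, and the larger the intrinsic reward.
prediction = model(state, action)
intrinsic_reward = ((prediction - next_state) ** 2).mean().item()

extrinsic_reward = 0.0             # sparse: often zero for long stretches
total_reward = extrinsic_reward + intrinsic_reward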

8.1   Tackling Sparse Rewards with Predictive Coding

8.2   Inverse Dynamics Prediction

8.3   Setting up Super Mario Bros.

8.4   Preprocessing and the Q-network

8.5   Setting up the Q-network and Policy Function

8.6   Intrinsic Curiosity Module

8.7   Alternative Intrinsic Reward Mechanisms

8.8   Summary