Chapter 10. Interpretable reinforcement learning: Attention and relational models

 

This chapter covers

  • Implementing a relational reinforcement learning algorithm using the popular self-attention model (see the brief sketch after this list)
  • Visualizing attention maps to better interpret the reasoning of an RL agent
  • Reasoning about model invariance and equivariance
  • Incorporating double Q-learning to improve the stability of training
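
The sections of this chapter build these pieces up step by step, but as a preview, here is a minimal single-head self-attention layer of the general kind we will develop. This is only a sketch under stated assumptions: it uses PyTorch, and the class name SimpleSelfAttention and its dimensions are illustrative, not the chapter's final relational module.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleSelfAttention(nn.Module):
    """Illustrative single-head self-attention over a set of 'node' vectors."""
    def __init__(self, node_dim, key_dim):
        super().__init__()
        # Linear maps that produce queries, keys, and values for every node
        self.query = nn.Linear(node_dim, key_dim)
        self.key = nn.Linear(node_dim, key_dim)
        self.value = nn.Linear(node_dim, node_dim)

    def forward(self, nodes):
        # nodes: (batch, num_nodes, node_dim), e.g. one node per grid cell of a game state
        Q, K, V = self.query(nodes), self.key(nodes), self.value(nodes)
        # Scaled dot-product scores: how strongly each node attends to every other node
        scores = Q @ K.transpose(-2, -1) / K.shape[-1] ** 0.5
        attn = F.softmax(scores, dim=-1)   # the attention map we can later visualize
        return attn @ V, attn              # updated node features plus the map itself

# Quick check: a batch of one "state" with 5 nodes of dimension 16
layer = SimpleSelfAttention(node_dim=16, key_dim=8)
out, attn_map = layer(torch.randn(1, 5, 16))
print(out.shape, attn_map.shape)  # torch.Size([1, 5, 16]) torch.Size([1, 5, 5])

The attention map attn is a num_nodes × num_nodes matrix of weights, and it is this kind of object that can be plotted to interpret which parts of the state the agent is relating to which others, the topic we return to in section 10.6.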

By this point we hope you have come to appreciate just how powerful the combination of deep learning and reinforcement learning is for solving tasks previously thought to be the exclusive domain of humans. Deep learning gives us a class of powerful algorithms for learning and reasoning about complex patterns in data, and reinforcement learning is the framework we use to solve control problems.

Throughout this book we’ve used games as a laboratory for experimenting with reinforcement learning algorithms, because they let us assess these algorithms in a controlled setting. When we build an RL agent that learns to play a game well, we are generally satisfied that our algorithm is working. Of course, reinforcement learning has many applications beyond playing games, and in some of those domains the raw performance of an algorithm on some metric (e.g., accuracy on some task) is not useful on its own; we also need to know how the algorithm arrives at its decisions.

10.1. Machine learning interpretability with attention and relational biases

10.2. Relational reasoning with attention

10.3. Implementing self-attention for MNIST

10.4. Multi-head attention and relational DQN

10.5. Double Q-learning

10.6. Training and attention visualization

Summary
