Chapter 8. Reinforcement learning

 

This chapter covers

  • Defining reinforcement learning
  • Implementing reinforcement learning

Humans learn from past experiences (or at least, they should). You didn’t get so charming by accident. Years of compliments and criticism have helped shape who you are today. This chapter is about designing a machine-learning system driven by criticisms and rewards.

You learn what makes people happy, for example, by interacting with friends, family, or even strangers. You figure out how to ride a bike by trying out various muscle movements until riding just clicks. When you perform actions, you’re sometimes rewarded immediately. For example, finding a good restaurant nearby might yield instant gratification. Other times, the reward doesn’t appear right away, such as when you travel a long distance to find an exceptional place to eat. Reinforcement learning is about choosing the right action in any given state, as in figure 8.1, which shows a person making decisions to arrive at their destination.

Figure 8.1. A person navigating to a destination through traffic and unexpected situations is a typical problem setup for reinforcement learning.

8.1. Formal notions

8.2. Applying reinforcement learning

8.3. Implementing reinforcement learning

8.4. Exploring other applications of reinforcement learning

8.5. Summary