chapter thirteen

13 Reinforcement learning

This chapter covers

Defining reinforcement learning
Implementing reinforcement learning

Humans learn from experiences (or at least should). You didn’t get so charming by accident. Years of positive compliments as well as negative criticism have all helped shape who you are today. This chapter is about designing a machine-learning system driven by criticisms and rewards.

You learn what makes people happy, for example, by interacting with friends, family members, or even strangers, and you figure out how to ride a bike by trying out various muscle movements until riding clicks. When you perform actions, you’re sometimes rewarded immediately. Finding a good restaurant nearby might yield instant gratification, for example. At other times, the reward doesn’t appear right away; you might have to travel a long distance to find an exceptional place to eat. Reinforcement learning is about choosing the right actions, given any state—such as in figure 13.1, which shows a person making decisions to arrive at their destination.

Figure 13.1 A person navigating to reach a destination in the midst of traffic and unexpected situations is a problem setup for reinforcement learning.

13.1 Formal notions

13.1.1 Policy

13.1.2 Utility

13.2 Applying reinforcement learning

13.3 Implementing reinforcement learning

13.4 Exploring other applications of reinforcement learning

Summary