Chapter 9. Learning by practice: reinforcement learning

This chapter covers

Defining a task for reinforcement learning
Building a learning agent for games
Collecting self-play experiences for training

I’ve probably read a dozen books on Go, all written by strong pros from China, Korea, and Japan. And yet I’m just an intermediate amateur player. Why haven’t I reached the level of these legendary players? Have I forgotten their lessons? I don’t think that’s it; I can practically recite Toshiro Kageyama’s Lessons in the Fundamentals of Go (Ishi Press, 1978) by heart. Maybe I just need to read more books....

I don’t know the full recipe for becoming a top Go star, but I know at least one difference between me and Go professionals: practice. A Go player probably clocks in five or ten thousand games before qualifying as a professional. Practice creates knowledge, and sometimes that’s knowledge that you can’t directly communicate. You can summarize that knowledge—that’s what makes it into Go books. But the subtleties get lost in the translation. If I expect to master the lessons I’ve read, I need to put in a similar level of practice.

9.1. The reinforcement-learning cycle

9.2. What goes into experience?

9.3. Building an agent that can learn

9.4. Self-play: how a computer program practices

9.5. Summary