7 Monte Carlo Tree Search: Searching with Reinforcement Learning Principles
This chapter covers
- Implementing tree data structures for planning problems.
- Implementing fundamental tree search algorithms (BFS, DFS).
- Solving the capacitated vehicle routing problem with Monte Carlo tree search.
- Comparing five paradigms for solving sequential decision-making problems.
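To ground the first two bullets, here is a minimal sketch of the kind of tree structure and breadth-first search we will build on later in the chapter. The `Node` class and `bfs` function are illustrative names, not the chapter's final implementation:

```python
from collections import deque

class Node:
    """A minimal tree node for planning problems (illustrative sketch)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []

    def add_child(self, state):
        """Create a child node holding the given state and attach it."""
        child = Node(state, parent=self)
        self.children.append(child)
        return child

def bfs(root, goal_state):
    """Breadth-first search: expand the tree level by level,
    returning the first node whose state matches goal_state."""
    frontier = deque([root])
    while frontier:
        node = frontier.popleft()
        if node.state == goal_state:
            return node
        frontier.extend(node.children)
    return None
```

Swapping the queue (`popleft`) for a stack (`pop`) turns this into depth-first search; MCTS will replace the fixed expansion order with statistics gathered from simulated rollouts.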
Monte Carlo Tree Search combined with deep neural networks is what made it possible to finally beat a Go world champion. This approach is applicable to a wide range of decision-making processes, including those in real-world businesses.
Demis Hassabis, Nobel Prize Winner & Google DeepMind CEO
If reinforcement learning had a favorite child—the one that blends intuition, structure, and audacity—it would be Monte Carlo Tree Search (MCTS). In chapter 4, we mapped five major paradigms for tackling sequential decision-making.
Of the five, we've already gone deep on dynamic programming, multi-armed bandits, and temporal-difference learning. We also touched on Monte Carlo simulation, but only in passing. This chapter does two things at once: it closes the loop on all five paradigms, and it introduces a method that blends them creatively in practice, in a way that can inspire new approaches to sequential decision-making problems.