7 Monte Carlo Tree Search: Searching with Reinforcement Learning Principles
This chapter covers
- Implementing tree data structures for planning problems.
- Implementing fundamental tree search algorithms (BFS, DFS).
- Solving the capacitated vehicle routing problem with Monte Carlo tree search.
- Comparing five paradigms for solving sequential decision-making problems.
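To ground the first two bullets, here is a minimal sketch of the kind of tree structure and breadth-first search we will build on later in the chapter. The `Node` class and `bfs` function are illustrative names, not the chapter's final implementation:

```python
from collections import deque

class Node:
    """A minimal tree node for planning problems (illustrative sketch)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []

    def add_child(self, state):
        """Create a child node holding the given state and attach it."""
        child = Node(state, parent=self)
        self.children.append(child)
        return child

def bfs(root, goal_state):
    """Breadth-first search: expand the tree level by level,
    returning the first node whose state matches goal_state."""
    frontier = deque([root])
    while frontier:
        node = frontier.popleft()
        if node.state == goal_state:
            return node
        frontier.extend(node.children)
    return None
```

Swapping the queue (`popleft`) for a stack (`pop`) turns this into depth-first search; MCTS will replace the fixed expansion order with statistics gathered from simulated rollouts.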
Monte Carlo Tree Search combined with deep neural networks is what made it possible to finally beat a Go world champion. This approach is applicable to a wide range of decision-making processes, including those in real-world businesses.
Demis Hassabis, Nobel Prize Winner & Google DeepMind CEO
If reinforcement learning had a favorite child—the one that blends intuition, structure, and audacity—it would be Monte Carlo Tree Search (MCTS). In chapter 4, we mapped five major paradigms for tackling sequential decision-making.
Of the five, we've already gone deep on dynamic programming, multi-armed bandits, and temporal-difference learning. We also touched on Monte Carlo simulation, but only in passing. This chapter does two things at once: it closes the loop on all five paradigms, and it introduces a method that blends them creatively in practice, in a way that can inspire new approaches to sequential decision-making problems.