6 Improving agents' behaviors
In this chapter:
- You learn about improving policies when learning from feedback that is simultaneously sequential and evaluative.
- You develop algorithms for finding optimal policies in reinforcement learning environments when the transition and reward functions are unknown.
- You write code for agents that can go from random to optimal behavior using only their own experiences and decision-making, and you apply these agents to a variety of environments.
When it is obvious that the goals cannot be reached, don’t adjust the goals, adjust the action steps.
— Confucius, Chinese teacher, editor, politician, and philosopher of the Spring and Autumn period of Chinese history