2 From effect to cause: Thomas Bayes and the first algorithm of learning
This chapter covers
- Thomas Bayes’ An Essay Towards Solving a Problem in the Doctrine of Chances (1763) and its geometric solution to inverse probability
- The logic of Bayes’ Theorem—how priors and likelihoods combine to produce rational posterior belief
- The foundational rules of probability—addition, complement, and multiplication—that make Bayesian inference work
- Bayesian reasoning in action, from medical diagnosis and recommendation systems to modern machine learning and AI
- Common misconceptions about Bayes’ Theorem and why they lead to flawed interpretation and misuse
How can one reason backward, from observed outcomes to the probability of an underlying cause? That was the central problem Thomas Bayes tackled in his quietly revolutionary essay, published posthumously in 1763. At a time when probability theory focused primarily on predicting future events under known conditions, Bayes asked a different question: given what we observe, how should we reason about what we cannot directly see?
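To make the backward step concrete, here is a minimal sketch of a single Bayesian update, using the medical diagnosis setting mentioned above. The function name `posterior` and all of the numbers (a 1% prior, a 95% true-positive rate, a 5% false-positive rate) are illustrative assumptions, not figures from Bayes' essay or from any real test.

```python
def posterior(prior, p_obs_given_cause, p_obs_given_not_cause):
    """Return P(cause | observation) for a binary cause via Bayes' Theorem."""
    # Total probability of the observation, summed over both possible states.
    evidence = (p_obs_given_cause * prior
                + p_obs_given_not_cause * (1.0 - prior))
    # Posterior = likelihood * prior / evidence.
    return p_obs_given_cause * prior / evidence


# Hypothetical inputs, chosen only for illustration.
prior_disease = 0.01          # belief before seeing the test result
p_pos_given_disease = 0.95    # chance of a positive test if the cause is present
p_pos_given_healthy = 0.05    # chance of a positive test if the cause is absent

p_disease_given_pos = posterior(
    prior_disease, p_pos_given_disease, p_pos_given_healthy
)
print(round(p_disease_given_pos, 3))  # 0.161
```

Under these assumed numbers, a positive result raises the belief in the cause from 1% to roughly 16%, not to 95%. The prior carries real weight when moving from effect to cause, which is the pattern the rest of the chapter develops.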
2.1 From observation to belief: how Bayesian reasoning begins
2.2 From prior to posterior: how Bayes’ Theorem works
2.2.1 The mathematical structure of Bayes’ Theorem
2.2.2 Before the formula: Bayes’ spatial intuition
2.2.3 Bayesian inference in real-world scenarios
2.2.4 The structure that makes inference possible
2.3 Building the machinery: the rules that power the theorem
2.3.1 Partitioning belief: the addition rule
2.3.2 Contrasting outcomes: the complement rule
2.3.3 Expressing likelihoods: odds and expected value
2.3.4 Linking outcomes: the logic of the multiplication rule
2.3.5 Independent repetition: compound events
2.3.6 Likelihood from repetition: binomial probability
2.4 Applications in machine learning and AI
2.4.1 Naive Bayes classifiers
2.4.2 Bayesian networks
2.4.3 Bayesian optimization
2.4.4 Thompson sampling
2.4.5 Bayesian A/B testing
2.4.6 The addition and multiplication probability rules in action
2.5 Why it still matters: everyday and not-so-everyday uses
2.5.1 Forecasting under uncertainty
2.5.2 Reinforcement learning and belief updating
2.5.3 Bayesian decision analysis
2.5.4 Markov Chain Monte Carlo (MCMC) and approximate inference
2.6 Where Bayesian inference goes wrong
2.7 Summary