chapter two

2 A Primer on Probabilistic Generative Modeling

 

This chapter covers

  • A primer on probability models
  • Computational probability with the pgmpy and pyro libraries
  • Statistics for causality: data, populations, and models
  • Distinguishing between probability models and subjective Bayesianism

Chapter 1 made the case for learning how to code causal AI. This chapter introduces the fundamentals we need to tackle causal modeling with probabilistic machine learning. Probabilistic machine learning refers, roughly, to machine learning techniques that use probability to model uncertainty and simulate data, and it comes with a flexible suite of tools for building such models. The chapter covers the concepts from probability, statistics, modeling, inference, and even philosophy that we will need to implement key ideas from causal inference with this approach.

This is not a mathematically exhaustive introduction to these ideas. I focus on what is needed for the rest of this book and omit the rest. That said, any data scientist seeking causal inference expertise should not neglect the practical nuances of probability, statistics, machine learning, and computer science. See the book notes at www.altdeep.ai/causalAIbook for recommended resources that offer deeper introductions or review material.

Summary of programming libraries

2.1 Primer on probability

2.1.1 Random variables and probability

2.1.2 Probability distributions and distribution functions

2.1.3 Joint probability and conditional probability

2.1.4 The chain rule, the law of total probability, and Bayes' rule

2.1.5 Markovian assumptions and Markov kernels

2.1.6 Parameters

2.1.7 Canonical classes of probability distribution

2.1.8 Visualizing distributions

2.1.9 Independence and conditional independence

2.1.10 Expected value

2.2 Computational probability

2.2.1 The physical interpretation of probability

2.2.2 Random generation

2.2.3 Coding random processes

2.2.4 Monte Carlo simulation and expectation

2.2.5 Programming probabilistic inference

2.3 Data, populations, statistics, and models

2.3.1 Probability distributions as models for populations

2.3.2 From the observed data to the data generating process

2.3.3 Statistical tests for independence

2.3.4 Statistical estimation of model parameters

2.4 Determinism and subjective probability

2.4.1 Determinism

2.4.2 Subjective probability

2.5 Summary