2 A primer on probabilistic generative modeling

 

This chapter covers

  • A primer on probability models
  • Computational probability with the pgmpy and Pyro libraries
  • Statistics for causality: data, populations, and models
  • Distinguishing between probability models and subjective Bayesianism

Chapter 1 made the case for learning how to code causal AI. This chapter introduces the fundamentals we need to tackle causal modeling with probabilistic machine learning, which roughly refers to machine learning techniques that use probability to model uncertainty and simulate data. A flexible suite of cutting-edge tools is available for building probabilistic machine learning models. To put those tools to work, we will cover the concepts from probability, statistics, modeling, inference, and even philosophy that we need in order to implement key ideas from causal inference with the probabilistic machine learning approach.

This chapter will not provide a mathematically exhaustive introduction to these ideas. I’ll focus on what the rest of this book requires and omit everything else. That said, any data scientist seeking causal inference expertise should not neglect the practical nuances of probability, statistics, machine learning, and computer science. See the chapter notes at https://www.altdeep.ai/p/causalaibook for recommended resources where you can get deeper introductions or review materials.

In this chapter, I’ll introduce two Python programming libraries for probabilistic machine learning: pgmpy and Pyro.

2.1 Primer on probability

2.1.1 Random variables and probability

2.1.2 Probability distributions and distribution functions

2.1.3 Joint probability and conditional probability

2.1.4 The chain rule, the law of total probability, and Bayes’ rule

2.1.5 Markovian assumptions and Markov kernels

2.1.6 Parameters

2.1.7 Canonical classes of probability distribution

2.1.8 Visualizing distributions

2.1.9 Independence and conditional independence

2.1.10 Expected value

2.2 Computational probability

2.2.1 The physical interpretation of probability

2.2.2 Random generation

2.2.3 Coding random processes

2.2.4 Monte Carlo simulation and expectation

2.2.5 Programming probabilistic inference