11 Building a causal inference workflow


This chapter covers

  • Building a causal analysis workflow
  • Estimating causal effects with DoWhy
  • Estimating causal effects using machine learning methods
  • Causal inference with causal latent variable models

In chapter 10, I introduced a causal inference workflow; in this chapter we'll build that workflow out in full. We'll concentrate on one type of query in particular, causal effects, but the workflow generalizes to all causal queries.

Specifically, we'll target the estimation of average treatment effects (ATEs) and conditional average treatment effects (CATEs), because they are the most popular causal queries.
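For reference, with a binary treatment T, outcome Y, and covariates X, these quantities are standardly defined in do-notation as

\mathrm{ATE} = \mathbb{E}[\,Y \mid do(T=1)\,] - \mathbb{E}[\,Y \mid do(T=0)\,]

\mathrm{CATE}(x) = \mathbb{E}[\,Y \mid do(T=1), X=x\,] - \mathbb{E}[\,Y \mid do(T=0), X=x\,]

In other words, the CATE is the ATE restricted to the subpopulation where X = x.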

In chapter 1, I mentioned “the commodification of inference”: how modern software libraries let us abstract away the statistical and computational details of an inference algorithm. The first thing you’ll see in this chapter is how the DoWhy library “commodifies” causal inference, freeing us to focus at a high level on the causal assumptions each algorithm makes and whether those assumptions are appropriate for our problem.
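To make this concrete, here is a minimal sketch of the end-to-end workflow in DoWhy, using one of its built-in simulated datasets. The dataset parameters and the estimation method name are illustrative choices for this sketch; the chapter's worked examples will differ.

import dowhy.datasets
from dowhy import CausalModel

# Simulate data with a known treatment effect (beta) and three observed confounders
data = dowhy.datasets.linear_dataset(
    beta=10,
    num_common_causes=3,
    num_samples=5000,
    treatment_is_binary=True,
)

# Step 2: Build the model from the data and the causal DAG
model = CausalModel(
    data=data["df"],
    treatment=data["treatment_name"],
    outcome=data["outcome_name"],
    graph=data["gml_graph"],
)

# Step 3: Identify the estimand (here, a backdoor adjustment)
estimand = model.identify_effect()

# Step 4: Estimate the estimand with a chosen estimator
estimate = model.estimate_effect(estimand, method_name="backdoor.linear_regression")
print(estimate.value)  # should be close to the true effect of 10

Notice how little of this code is about inference itself; most of it states assumptions (the graph) and choices (the estimand and estimator), which is exactly the point of commodification.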

We’ll see this phenomenon at play again in an example that uses probabilistic machine learning to do causal effect inference with a causal generative model containing latent variables. There, deep learning with PyTorch provides another way to commodify inference.
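As a preview, a causal generative model with a latent variable can be written as a probabilistic program. The following is a minimal sketch in Pyro (a probabilistic programming library built on PyTorch); the structure, names, and parameter values are illustrative assumptions, not the model we'll build later in the chapter.

import pyro
import pyro.distributions as dist

def causal_model(n_samples):
    with pyro.plate("units", n_samples):
        # U is a latent (unobserved) common cause of treatment and outcome
        u = pyro.sample("U", dist.Normal(0.0, 1.0))
        # Treatment assignment depends on the latent confounder
        t = pyro.sample("T", dist.Bernoulli(logits=u))
        # Outcome depends on both the treatment and the latent confounder
        y = pyro.sample("Y", dist.Normal(2.0 * t + u, 1.0))
    return t, y

Because the model is ordinary PyTorch computation, inference over the latent variable (for example, stochastic variational inference with pyro.infer.SVI) rides on PyTorch's automatic differentiation and GPU support, which is what makes this another instance of commodified inference.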

11.1 Step 1: Select the query

Recall the causal inference workflow from chapter 10, shown again in figure 11.1.

11.2 Step 2: Build the model

11.3 Step 3: Identify the estimand

11.3.1 The backdoor adjustment estimand

11.3.2 The instrumental variable estimand

11.3.3 The front-door adjustment estimand

11.3.4 Choosing estimands and reducing “DAG anxiety”

11.3.5 When you don’t have identification

11.4 Step 4: Estimate the estimand

11.4.1 Linear regression estimation of the backdoor estimand