4 Testing the DAG with causal constraints
This chapter covers
- Using d-separation to reason about how causality constrains conditional independence
- Using networkx and pgmpy to perform d-separation analysis
- Refuting a causal DAG using conditional independence tests
- Refuting a causal DAG using Verma constraints
Causality in the data-generating process induces constraints, such as conditional independence, on the joint probability distribution of the variables in that process. We saw a flavor of these constraints in the previous chapter in the form of the Markov property: given their direct causes, effects become independent of their indirect causes. These constraints give us a way to test our model against data; if the causal DAG we build is correct, we should see evidence of these constraints in the data.
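To make this concrete, here is a minimal simulation sketch; the chain X → Y → Z, its coefficients, and the sample size are arbitrary choices for illustration. In this chain, X is an indirect cause of Z and Y is Z's direct cause, so the Markov property says Z should be independent of X once we condition on Y, and that constraint shows up in the simulated data as a near-zero partial correlation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical chain X -> Y -> Z: X is an indirect cause of Z, Y is its direct cause
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.5 * y + rng.normal(size=n)

# X and Z are strongly dependent marginally...
print(np.corrcoef(x, z)[0, 1])          # far from 0

# ...but conditioning on the direct cause Y removes the dependence.
# Partial correlation via the residuals of least-squares fits on Y.
rx = x - np.polyval(np.polyfit(y, x, 1), y)
rz = z - np.polyval(np.polyfit(y, z, 1), y)
print(np.corrcoef(rx, rz)[0, 1])        # close to 0: X independent of Z given Y
```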
In this chapter, we’ll use statistical analysis of the data to test our causal DAG. Specifically, we’ll try to refute it, meaning we’ll look for ways the data suggests the DAG is wrong. We’ll learn to test the DAG with conditional independence tests and with Verma constraints, an extension of conditional independence that we can test even when some variables in our causal DAG are not observed in the data.
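The refutation logic can be sketched as follows; the "true" data-generating process, the candidate DAG, and the partial-correlation test used here are illustrative assumptions, not the specific procedure developed later in the chapter. We take a candidate DAG, derive a conditional independence it implies, and test that implication in the data; a clear rejection refutes the DAG.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5_000

# True process (unknown to the analyst): X -> Y -> Z
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.5 * y + rng.normal(size=n)

# Suppose our candidate DAG is X -> Z -> Y. That DAG implies X is independent of Y given Z.
# Test the implication with a simple partial-correlation CI test (residual trick).
rx = x - np.polyval(np.polyfit(z, x, 1), z)
ry = y - np.polyval(np.polyfit(z, y, 1), z)
r, p = stats.pearsonr(rx, ry)
print(f"partial correlation = {r:.3f}, p-value = {p:.3g}")
# A tiny p-value rejects the implied independence, refuting the candidate DAG.
```

Of course, failing to reject does not prove the DAG is right; it only means this particular constraint offered no evidence against it.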
To start, we look at the concept of d-separation. D-separation tells us which conditional independence constraints should hold given our causal DAG, and it is the keystone of graphical causal inference.
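As a preview of the tooling, both networkx and pgmpy can answer d-separation queries directly from a graph object. The small smoking/tar/cancer/genotype DAG below is a made-up example; note that the networkx function is named is_d_separator in recent releases (older releases call it d_separated), and pgmpy's DAG.get_independencies enumerates the conditional independencies the graph implies.

```python
import networkx as nx
from pgmpy.base import DAG

# A made-up DAG for illustration
edges = [("smoking", "tar"), ("tar", "cancer"), ("genotype", "cancer")]

# networkx: check a single d-separation statement
g = nx.DiGraph(edges)
print(nx.is_d_separator(g, {"smoking"}, {"cancer"}, {"tar"}))   # True
# (on networkx versions before 3.3, use nx.d_separated instead)

# pgmpy: enumerate the conditional independencies the DAG implies
dag = DAG(edges)
print(dag.get_independencies())
```

If d-separation says smoking and cancer should be independent given tar, a conditional independence test on the data can check that claim, which is exactly the refutation strategy this chapter develops.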