Brief Summary of Causal Inference Methods
Methods for causal inference can be classified into three categories by the nuisance models they need to learn:
- Treatment-model-based: estimates $e(a|x) = P(A=a|X=x)$, the conditional probability of receiving treatment $a$ given covariates $x$. A common example is Inverse Probability of Treatment Weighting (IPTW).
- Outcome-model-based: estimates $Q(a,x)=E[Y|A=a,X=x]$, the conditional expectation of the outcome given treatment and covariates. For example, G-computation.
- Doubly Robust (DR): estimates both $e(a|x)$ and $Q(a,x)$, then combines these initial estimates in a specific way to estimate the target causal estimand. Examples include AIPTW, TMLE, and DML.
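For a binary treatment, the three families of estimators for the average treatment effect (ATE), $E[Y(1)-Y(0)]$, can be sketched as follows. This is a minimal illustration, not a full implementation: the function names are mine, and the nuisance estimates ($\hat{e}$, $\hat{Q}$) are assumed to have been fitted elsewhere and passed in as arrays, with $\hat{e}$ denoting the estimated $e(1|x)$.

```python
import numpy as np

def iptw(y, a, e_hat):
    """IPTW: uses only the treatment model e(1|x)."""
    return np.mean(a * y / e_hat - (1 - a) * y / (1 - e_hat))

def g_computation(q1_hat, q0_hat):
    """G-computation: uses only the outcome model Q(a, x),
    averaging the predicted contrast Q(1, x) - Q(0, x) over the sample."""
    return np.mean(q1_hat - q0_hat)

def aiptw(y, a, e_hat, q1_hat, q0_hat):
    """AIPTW (doubly robust): the outcome-model contrast plus an
    inverse-propensity-weighted correction using the residuals."""
    return np.mean(
        q1_hat - q0_hat
        + a / e_hat * (y - q1_hat)
        - (1 - a) / (1 - e_hat) * (y - q0_hat)
    )
```

With the true nuisance functions plugged in, all three recover the same estimand; they differ in which nuisance model they lean on when the models must be estimated.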
Common Impression of DR Methods
The common impression of these three methods is as follows:
- Treatment-model-based methods are consistent if the treatment model is consistent;
- Outcome-model-based methods are consistent if the outcome model is consistent;
- DR methods are consistent if either the treatment model or the outcome model is consistent.
This view suggests that the first two methods have only one chance to produce a consistent estimate, whereas DR methods have two chances. Hence, DR methods seem better.
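A quick simulation illustrates the "two chances" property. The data-generating process and variable names below are invented for illustration: the outcome model is deliberately broken (it ignores the confounder entirely), yet AIPTW remains consistent because the propensity model is correct.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical data-generating process; the true ATE is 2 by construction.
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))            # true propensity score e(1|x)
a = rng.binomial(1, e)
y = 1 + 2 * a + x + rng.normal(size=n)

# A badly misspecified outcome model: it ignores x and just uses the
# group means, so the fitted Q(1,x) and Q(0,x) are constants.
q1_bad = np.full(n, y[a == 1].mean())
q0_bad = np.full(n, y[a == 0].mean())

# G-computation with this outcome model reduces to the naive difference
# in means, which is confounded by x.
gcomp_bad = np.mean(q1_bad - q0_bad)

# AIPTW combines the same bad outcome model with the correct propensity
# score; the weighted residual correction restores consistency.
aiptw_dr = np.mean(
    q1_bad - q0_bad
    + a / e * (y - q1_bad)
    - (1 - a) / (1 - e) * (y - q0_bad)
)
```

Here `gcomp_bad` is biased away from the true ATE of 2, while `aiptw_dr` lands close to it: the correct propensity model was the second chance.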
However, is it the whole story?
Not quite.
The statements above concern consistency, which is only the minimum requirement for a good estimator. To make valid statistical inference, we need more: on top of consistency, a valid estimator must also provide reliable confidence intervals (or p-values).
And this is where DR methods truly stand out. Let’s preview the key idea before going into the math: