Brief Summary of Causal Inference Methods
Methods for causal inference can be classified into three categories by the nuisance models they need to learn:
- Treatment-model-based: estimates $e(a|x) = P(A=a|X=x)$, the conditional probability of receiving treatment $a$ given covariates $x$. A common example is Inverse Probability of Treatment Weighting (IPTW).
- Outcome-model-based: estimates $Q(a,x)=E[Y|A=a,X=x]$, the conditional expectation of the outcome given treatment and covariates. For example, G-computation.
- Doubly Robust (DR): estimates both $e(a|x)$ and $Q(a,x)$, then combines these initial estimates in a specific way to estimate the target causal estimand. Examples include AIPTW, TMLE, and DML.
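For a binary treatment, the three families of estimators for the average treatment effect (ATE), $E[Y(1)-Y(0)]$, can be sketched as follows. This is a minimal illustration, not a full implementation: the function names are mine, and the nuisance estimates ($\hat{e}$, $\hat{Q}$) are assumed to have been fitted elsewhere and passed in as arrays, with $\hat{e}$ denoting the estimated $e(1|x)$.

```python
import numpy as np

def iptw(y, a, e_hat):
    """IPTW: uses only the treatment model e(1|x)."""
    return np.mean(a * y / e_hat - (1 - a) * y / (1 - e_hat))

def g_computation(q1_hat, q0_hat):
    """G-computation: uses only the outcome model Q(a, x),
    averaging the predicted contrast Q(1, x) - Q(0, x) over the sample."""
    return np.mean(q1_hat - q0_hat)

def aiptw(y, a, e_hat, q1_hat, q0_hat):
    """AIPTW (doubly robust): the outcome-model contrast plus an
    inverse-propensity-weighted correction using the residuals."""
    return np.mean(
        q1_hat - q0_hat
        + a / e_hat * (y - q1_hat)
        - (1 - a) / (1 - e_hat) * (y - q0_hat)
    )
```

With the true nuisance functions plugged in, all three recover the same estimand; they differ in which nuisance model they lean on when the models must be estimated.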
Common Impression of DR Methods
The common impression of these three methods is as follows:
- Treatment-model-based methods are consistent if the treatment model is consistent;
- Outcome-model-based methods are consistent if the outcome model is consistent;
- DR methods are consistent if either the treatment model or the outcome model is consistent.
This view suggests that the first two methods have only one chance to produce a consistent estimate, whereas DR methods have two chances. Hence, DR methods seem better.
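A quick simulation illustrates the "two chances" property. The data-generating process and variable names below are invented for illustration: the outcome model is deliberately broken (it ignores the confounder entirely), yet AIPTW remains consistent because the propensity model is correct.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical data-generating process; the true ATE is 2 by construction.
x = rng.normal(size=n)
e = 1 / (1 + np.exp(-x))            # true propensity score e(1|x)
a = rng.binomial(1, e)
y = 1 + 2 * a + x + rng.normal(size=n)

# A badly misspecified outcome model: it ignores x and just uses the
# group means, so the fitted Q(1,x) and Q(0,x) are constants.
q1_bad = np.full(n, y[a == 1].mean())
q0_bad = np.full(n, y[a == 0].mean())

# G-computation with this outcome model reduces to the naive difference
# in means, which is confounded by x.
gcomp_bad = np.mean(q1_bad - q0_bad)

# AIPTW combines the same bad outcome model with the correct propensity
# score; the weighted residual correction restores consistency.
aiptw_dr = np.mean(
    q1_bad - q0_bad
    + a / e * (y - q1_bad)
    - (1 - a) / (1 - e) * (y - q0_bad)
)
```

Here `gcomp_bad` is biased away from the true ATE of 2, while `aiptw_dr` lands close to it: the correct propensity model was the second chance.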
However, is it the whole story?
Not quite.
The statements above concern consistency, which is only the minimum requirement for a good estimator. To make valid statistical inference, we need more: on top of consistency, a valid estimator must also provide reliable confidence intervals (or p-values).
And this is where DR methods truly stand out. Let’s preview the key idea before going into the math: