Double robustness

Suppose we are interested in estimating the average treatment effect (ATE), defined in potential-outcome notation as

$\tau = E(E(Y^1 \mid X) - E(Y^0 \mid X))$

where the outer expectation is over $X$. Assuming strong ignorability, so that

$E(Y \mid D = 1, X) = E(Y^1 \mid X)$

and

$E(Y \mid D = 0, X) = E(Y^0 \mid X),$

the ATE can be written as

$\tau = E(E(Y \mid D = 1, X) - E(Y \mid D = 0, X))$

where the outer expectation is over covariates $X$. Likewise, recall that the ATE can also be written as

$\tau = E\left(\frac{YD}{E(D =1 \mid X)} - \frac{Y(1-D)}{E(D =0 \mid X)}\right)$

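where the outer expectation is over the joint distribution of $(Y,D,X)$. Here $E(D = 1 \mid X)$ is shorthand for the propensity score $P(D = 1 \mid X)$, and $E(D = 0 \mid X)$ for $1 - P(D = 1 \mid X)$; both are assumed to be strictly positive so that the ratios are well defined.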
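
As a quick numerical check of these two representations, here is a minimal sketch on simulated data. The data-generating process (a binary covariate, propensities of 0.3 and 0.7, and a true ATE of 1.5) is entirely hypothetical and chosen only for illustration, as are the function names `regression_ate` and `ipw_ate`. The propensity score is treated as known rather than estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (an assumption, for illustration only).
x = rng.binomial(1, 0.5, size=n)              # binary covariate X
p = np.where(x == 1, 0.7, 0.3)                # true propensity P(D = 1 | X)
d = rng.binomial(1, p)                        # treatment indicator D
y0 = 1.0 + 2.0 * x + rng.normal(size=n)       # potential outcome Y^0
y1 = 2.0 + 3.0 * x + rng.normal(size=n)       # potential outcome Y^1
y = np.where(d == 1, y1, y0)                  # observed outcome Y
true_ate = 1.5                                # E(Y^1 - Y^0) = E(1 + X) = 1.5


def regression_ate(y, d, x):
    """Sample analogue of E(E(Y | D=1, X) - E(Y | D=0, X)) for discrete X."""
    diffs = np.zeros(len(y))
    for v in np.unique(x):
        m1 = y[(d == 1) & (x == v)].mean()    # estimate of E(Y | D=1, X=v)
        m0 = y[(d == 0) & (x == v)].mean()    # estimate of E(Y | D=0, X=v)
        diffs[x == v] = m1 - m0
    return diffs.mean()                       # average over the sample of X


def ipw_ate(y, d, p):
    """Sample analogue of E(YD / P(D=1|X) - Y(1-D) / P(D=0|X))."""
    return np.mean(y * d / p - y * (1 - d) / (1 - p))


reg_est = regression_ate(y, d, x)
ipw_est = ipw_ate(y, d, p)
print(f"regression: {reg_est:.3f}, IPW: {ipw_est:.3f}, truth: {true_ate}")
```

With a sample this large, both estimates should land close to the true value of 1.5, up to sampling noise.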

Now, consider the estimand

$E(f(X)) + E\left(\frac{YD}{g(X)}\right) - E\left(\frac{f(X)D}{g(X)}\right)$

where the expectation is taken over the joint distribution of $(Y,D,X)$ and $f(\cdot)$ and $g(\cdot)$ are fixed functions. We will consider what happens for particular specifications of $f(\cdot)$ and $g(\cdot)$. In particular, we will consider two cases.

First, suppose $f(X) = E(Y \mid D = 1, X)$, with $g(\cdot)$ left arbitrary. In this case, our estimand becomes

$E(E(Y \mid D = 1, X)) + E\left(\frac{YD}{g(X)}\right) - E\left(\frac{E(Y \mid D = 1, X)D}{g(X)}\right).$

By iterated expectation, the middle term can be rewritten as

$E\left(\frac{E(Y \mid D = 1, X)D}{g(X)}\right),$

which cancels with the third term, leaving only the first term, $E(E(Y \mid D = 1, X))$, which under ignorability equals $E(E(Y^1 \mid X))$.
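
The iterated-expectation step for the middle term can be spelled out as follows. Condition on $(D, X)$ first, then use the fact that $D$ is binary, so that $D \cdot E(Y \mid D, X) = D \cdot E(Y \mid D = 1, X)$ (both sides are zero whenever $D = 0$):

$E\left(\frac{YD}{g(X)}\right) = E\left(\frac{D \, E(Y \mid D, X)}{g(X)}\right) = E\left(\frac{D \, E(Y \mid D = 1, X)}{g(X)}\right).$

Note that nothing here requires $g(X)$ to be the true propensity score, only that it is nonzero.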

Second, suppose $g(X) = E(D = 1 \mid X)$, with $f(\cdot)$ left arbitrary. In this case, our estimand becomes

$E(f(X)) + E\left(\frac{YD}{E(D = 1 \mid X)}\right) - E\left(\frac{f(X)D}{E(D = 1 \mid X)}\right).$

By iterated expectation, the third term becomes

$E\left(\frac{f(X)E(D=1 \mid X)}{E(D = 1 \mid X)}\right) =E(f(X)),$

which cancels with the first term, leaving only the second term, which by the inverse-probability-weighting representation above equals $E(E(Y^1 \mid X))$.

If both of the above conditions hold, one gets the same result: the estimand equals $E(E(Y^1 \mid X))$. But either condition alone is sufficient, which is the sense in which the estimand is doubly robust. Applying similar reasoning to $E(E(Y^0 \mid X))$ and taking the difference allows us to estimate the ATE.
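
For concreteness, the analogous estimand for the untreated potential outcome can be sketched as below. The symbol $f_0(\cdot)$ is notation introduced here (not used elsewhere in this post) for a candidate model of the untreated outcome, and $1 - g(X)$ plays the role of the candidate for $E(D = 0 \mid X)$:

$E(f_0(X)) + E\left(\frac{Y(1-D)}{1 - g(X)}\right) - E\left(\frac{f_0(X)(1-D)}{1 - g(X)}\right)$

By the same two arguments, this equals $E(E(Y^0 \mid X))$ if either $f_0(X) = E(Y \mid D = 0, X)$ or $g(X) = E(D = 1 \mid X)$.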

If neither $f(X) = E(Y \mid D = 1, X)$ nor $g(X) = E(D = 1 \mid X)$ holds, then the estimand is no longer guaranteed to equal $E(E(Y^1 \mid X))$, and one is simply out of luck (i.e., won't be able to estimate the ATE via this estimand).
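
The cases above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the data-generating process is hypothetical (binary covariate, propensities 0.3 and 0.7, and $E(Y^1) = 3.5$), the "correct" $f$ and $g$ are plugged in as known oracle functions rather than estimated, and the misspecified versions ($f \equiv 0$, $g \equiv 0.5$) are arbitrary choices. The names `dr_mean_treated` and `simulate` are made up for this example.

```python
import numpy as np


def dr_mean_treated(y, d, f_x, g_x):
    """Sample analogue of E(f(X)) + E(YD/g(X)) - E(f(X)D/g(X))."""
    # Algebraically the same three terms, grouped as f + (Y - f) * D / g.
    return np.mean(f_x + (y - f_x) * d / g_x)


def simulate(n=200_000, seed=0):
    """Hypothetical data-generating process, for illustration only."""
    rng = np.random.default_rng(seed)
    x = rng.binomial(1, 0.5, size=n)             # binary covariate X
    p = np.where(x == 1, 0.7, 0.3)               # true P(D = 1 | X)
    d = rng.binomial(1, p)                       # treatment indicator D
    y1 = 2.0 + 3.0 * x + rng.normal(size=n)      # Y^1, so E(Y^1) = 3.5
    y0 = 1.0 + 2.0 * x + rng.normal(size=n)      # Y^0
    y = np.where(d == 1, y1, y0)                 # observed outcome Y
    return y, d, x, p


y, d, x, p = simulate()

f_true = 2.0 + 3.0 * x            # oracle E(Y | D = 1, X)
f_wrong = np.zeros_like(f_true)   # deliberately misspecified outcome model
g_true = p                        # oracle propensity score
g_wrong = np.full_like(p, 0.5)    # deliberately misspecified propensity

estimates = {
    "f correct, g correct": dr_mean_treated(y, d, f_true, g_true),
    "f correct, g wrong": dr_mean_treated(y, d, f_true, g_wrong),
    "f wrong, g correct": dr_mean_treated(y, d, f_wrong, g_true),
    "f wrong, g wrong": dr_mean_treated(y, d, f_wrong, g_wrong),
}

for label, value in estimates.items():
    print(f"{label}: {value:.3f}  (target E(Y^1) = 3.5)")
```

In the first three cases the estimate should sit close to 3.5, up to sampling noise. In the last case, with both pieces misspecified, the estimand under this particular setup works out to $2E(YD) = 4.1$ rather than 3.5, so the estimate is visibly off target.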
