Jan 17

Today in class we:

  1. Described the commonly invoked SUTVA condition,
  2. reviewed the Morgan and Rubin paper, and
  3. saw some simple examples of adjustments for confounding by blocking on observable features.

Here again is a link to the Morgan and Rubin paper, and here is a link to the R script we will take a look at.


SUTVA stands for Stable Unit Treatment Value Assumption. It is typical to think about SUTVA as comprising two distinct conditions. These go by various names, but here I will call them

  1. Well-defined treatment levels. (This is what “stability” refers to.)
  2. No interference. (This is what “unit” refers to.)

In this discussion we will assume a dichotomous treatment variable so we can refer to “treated” and “untreated” units, but everything works out fine for additional treatment levels.

Item one simply refers to the idea that “treated” needs to mean the same thing for everyone and be well-defined. To clarify by way of counter-example, suppose one wanted to undertake an analysis of the benefits of prenatal care on the birthweight of babies. Although one might have a recorded variable denoting “prenatal care”, if this recorded variable means quite different things across different individuals in the data set, then the treatment is not “stable” across units and the causal analysis can never get off the ground.

Item two stipulates that the potential outcomes of any individual are not impacted by the treatment assignment of other individuals. Formally, it permits us to write y_i^1(\mathbf{d}) = y_i^1 (and likewise for the untreated potential outcome y_i^0), where \mathbf{d} is a vector denoting the treatment assignment of all individuals (units) in the study. Again, by way of counter-example, consider a study of educational effectiveness comparing two teaching styles. Suppose that students in a single school were assigned either to classroom A, using one curriculum, or classroom B, using another. However, it is plausible that learning outcomes are impacted by the number of students in a classroom. Clearly the outcomes of this study will depend on the relative class sizes, which are a function of the treatment assignment of all the students. This is a violation of the no-interference part of SUTVA. By contrast, if each student in the study attended a separate school and was assigned to classes with one of two fixed (stable!) curricula, then no interference is more reasonable.

For those of you readers with a statistics background, you can think of SUTVA as the causal analog of i.i.d., “independent and identically distributed”.  Roughly, “no interference” is like “independence” and “stability” is like “identically distributed”.


Let’s turn now to the Morgan and Rubin paper. Last week we saw that randomization is adequate for ensuring that we can consistently estimate average treatment effects. Morgan and Rubin explore the intuition that we can improve on randomization if we can ensure that the treatment and control arms of our study are more similar to one another in certain relevant respects. In more detail, the intuition is this: while randomization ensures that the two treatment arms will be the same, in all respects, on average, any particular realized assignment might leave the two study arms observably different in certain ways known to be relevant to outcome distributions. This does not invalidate our estimator of the average treatment effect, but intuitively it suggests that the estimator will be worse in some way, because it is seen to be making an “apples-to-oranges” rather than an “apples-to-apples” comparison.

Essentially, Morgan and Rubin describe a constrained randomization, where treatment assignment is randomized subject to constraints on how different the two study arms can be in terms of some baseline covariate(s). What they show in the paper is that this constrained randomization leads to a lower-variance estimator of the average treatment effect. In the limit, as we are able to measure more and more relevant variables, we can drive the variance of our estimator to zero by restricting our randomization ever more stringently, so that the treatment and control groups consist of prognostically identical individuals. Of course, in the limiting case we do not need randomization at all; randomization is there to guarantee average balance on the unknown factors.
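To make the mechanism concrete, here is a minimal sketch in Python (the course script is in R) of constrained randomization on a single made-up covariate: we simply redraw the assignment until the treated and control means of the covariate are within a tolerance, then estimate the ATE by a difference in means. The data, the tolerance, and the constant treatment effect of 2 are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def rerandomize(x, n_treat, threshold, max_tries=10_000):
    """Draw treatment assignments until the treated/control difference
    in covariate means falls below `threshold`."""
    n = len(x)
    for _ in range(max_tries):
        z = np.zeros(n, dtype=bool)
        z[rng.choice(n, size=n_treat, replace=False)] = True
        if abs(x[z].mean() - x[~z].mean()) < threshold:
            return z
    raise RuntimeError("no acceptable assignment found")

# Toy data: one prognostic covariate, and potential outcomes
# with a constant treatment effect of 2.
n = 100
x = rng.normal(size=n)
y0 = x + rng.normal(scale=0.5, size=n)   # untreated potential outcome
y1 = y0 + 2.0                            # treated potential outcome

z = rerandomize(x, n_treat=50, threshold=0.05)
y = np.where(z, y1, y0)                  # observed outcomes
ate_hat = y[z].mean() - y[~z].mean()
```

Because the accepted assignment is balanced on x, the only noise left in ate_hat comes from the part of the outcome that x does not predict; with a more predictive covariate (or more covariates), the tolerance can be tightened and the variance driven down further.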


The Morgan and Rubin paper highlights how easy it is to slip between two seemingly distinct conceptions of causal inference, one based on the statistical language of randomization, independence, and average treatment effects, and another based on the notion of balance. The potential outcomes framework is rooted in the statistical conceptualization, while the notion of balance is ultimately rooted in the Method of Differences laid out by J.S. Mill. Mill posited that if two scenarios are identical in all respects but one, then the difference in outcome between these two scenarios can be attributed to that one respect in which they differ. The Method of Differences is nicely illustrated by twin studies where one twin receives the treatment and the other does not. No mention of randomization, no mention of average treatment effects; rather, a notion of prognostic matching.

The point of contact between these two notions is the variance of estimators. We can think of the Method of Differences as a special case where our estimator of the ATE has zero variance. Consider the following thought experiment: a randomized study whose subjects consist entirely of twin pairs. If we randomly assign these subjects to treatment and control conditions, we have a valid estimator of the average treatment effect. However, it would be much better to restrict ourselves to those randomizations in which each pair of twins is split across the treatment groups! This can be done by introducing dummy variables denoting twin pairs and balancing on these N/2 covariates. Instead of 2^N valid treatment assignments we restrict ourselves to 2^{N/2} assignments. According to the Method of Differences, the estimator that averages the within-pair treatment effect estimates ought to have zero variance, because the only thing different between the twins is which one received the treatment (and the difference in outcomes does not change depending on which of the two twins ends up getting the treatment).
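The zero-variance claim is easy to check by simulation. The Python sketch below (toy numbers of my own invention, with a constant treatment effect) makes each pair of twins prognostically identical and then draws many pair-split randomizations: every single one recovers the treatment effect exactly.

```python
import numpy as np

rng = np.random.default_rng(1)

n_pairs = 10
tau = 1.5                                  # constant treatment effect
pair_effect = rng.normal(size=n_pairs)     # shared prognosis within each twin pair
y0 = np.repeat(pair_effect, 2)             # untreated outcomes, identical within a pair
y1 = y0 + tau                              # treated outcomes

# Pair-split randomization: for each pair, a coin flip decides which twin is treated.
# Twins occupy adjacent positions (0,1), (2,3), ...
estimates = []
for _ in range(200):
    first_treated = rng.integers(0, 2, size=n_pairs).astype(bool)
    z = np.empty(2 * n_pairs, dtype=bool)
    z[0::2] = first_treated
    z[1::2] = ~first_treated
    y = np.where(z, y1, y0)
    estimates.append(y[z].mean() - y[~z].mean())

estimates = np.array(estimates)
# Every pair-split assignment yields exactly tau: the estimator has zero variance.
```

If instead we drew unrestricted assignments of 10 treated out of 20, the same difference-in-means estimator would still be unbiased but would bounce around with the luck of how the pairs split.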

To explore this relationship between covariate balance and variance, we consider an alternative to constrained randomization called blocking. Blocking refers to the idea of randomizing within blocks (subgroups of individuals who share common baseline covariate values). Estimates of the ATE are then obtained by aggregating the estimates within each block. Blocking is simpler than re-randomization when it is feasible; Morgan and Rubin take up the comparison (with arguments I do not find entirely convincing) in Section 5.1 of the paper.
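For completeness, here is a minimal Python sketch of the blocking estimator (the made-up data has two blocks defined by a binary covariate and a constant treatment effect of 1): randomize within each block, take the difference in means within each block, and aggregate with block-size weights.

```python
import numpy as np

rng = np.random.default_rng(2)

def blocked_ate(y, z, block):
    """Difference in means within each block, averaged with block-size weights."""
    est = 0.0
    for b in np.unique(block):
        m = block == b
        est += m.sum() / len(y) * (y[m & z].mean() - y[m & ~z].mean())
    return est

# Toy data: a binary covariate defines two blocks with very different baselines.
n = 200
block = rng.integers(0, 2, size=n)
y0 = 3.0 * block + rng.normal(scale=0.3, size=n)   # untreated potential outcome
y1 = y0 + 1.0                                      # constant treatment effect of 1

# Randomize within each block: treat half of each block.
z = np.zeros(n, dtype=bool)
for b in (0, 1):
    idx = np.flatnonzero(block == b)
    z[rng.choice(idx, size=len(idx) // 2, replace=False)] = True

y = np.where(z, y1, y0)
ate_hat = blocked_ate(y, z, block)
```

Because the block indicator soaks up the large baseline difference between the two groups, the blocked estimator's variance depends only on the within-block noise, exactly the balance-reduces-variance story above.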

Using the provided R script we can explore the role of blocking in estimating the average treatment effect as we vary several factors: predictability of the outcome variable in terms of observable covariates, selection into treatment, and sample size. In another blog post I will provide a walkthrough of this script (maybe a video!), but this entry is already too long.

 
