(The following post originally appeared in
Observational Epidemiology.)
You've probably heard about the Harlem Children's Zone, an impressive, even inspiring initiative to improve the lives of poor inner-city children through charter schools and community programs. Having taught in Watts and the Mississippi Delta in my pre-statistician days, this is an area of long-standing interest to me and I like a lot of the things I'm hearing about HCZ. What I don't like nearly as much is the reaction I'm seeing to the
research study by Will Dobbie and Roland G. Fryer, Jr. of Harvard. Here's Alex Tabarrok at Marginal Revolution with a
representative sample, "I don't know why anyone interested in the welfare of children would want to discourage this kind of experimentation."
Maybe I can provide some reasons.
First off, this is an observational study, not a randomized experiment. I think we may be reaching the limits of what analysis of observational data can do in the education debate and, given the importance and complexity of the questions, I don't understand why we aren't employing randomized trials to answer some of these questions once and for all.
More significantly I'm also troubled by the aliasing of data on the Promise Academies and by the fact that the authors draw a conclusion ("HCZ is enormously successful at boosting achievement in math and ELA in elementary school and math in middle school. The impact of being offered admission into the HCZ middle school on ELA achievement is positive, but less dramatic. High-quality schools or community investments coupled with high-quality schools drive these results, but community investments alone cannot.") that the data can't support.
In statistics, aliasing means combining treatments in such a way that you can't tell which treatment or combination of treatments caused the effect you observed. In this case the first treatment is the educational environment of the Promise Academies. The second is something called social norming.
When you isolate a group of students, they will quickly arrive at a consensus of what constitutes normal behavior. It is a complex and somewhat unpredictable process driven by personalities and random connections and any number of outside factors. You can however, exercise a great deal of control over the outcome by restricting the make-up of the group.
If we restricted students via an application process, how would we expect that group to differ from the general population and how would that affect the norms the group would settle on? For starters, all the parents would have taken a direct interest in their children's schooling.
Compared to the general population, the applicants will be much more likely to see working hard, making good grades, not getting into trouble as normal behaviors. The applicants (particularly older applicants) would be more likely to be interested in school and to see academic and professional success as a reasonable possibility because they would have made an active choice to move to a new and more demanding school. Having the older students committed to the program is particularly important because older children play a disproportionate role in the setting of social norms.
Dobbie and Fryer address the question of self-selection, "[R]esults from any lottery sample may lack external validity. The counterfactual we identify is for students who are already interested in charter schools. The effect of being offered admission to HCZ for these students may be different than for other types of students." In other words, they can't conclude from the data how well students would do at the Promise Academies if, for instance, their parents weren't engaged and supportive (a group effective eliminated by the application process).
But there's another question, one with tremendous policy implications, that the paper does not address: how well would the students who were accepted to HCZ have done if they were given the same amount of instruction * as they would have received from HCZ using public school teachers while being isolated from the general population? (There was a control group of lottery losers but there is no evidence that they were kept together as a group.)
Why is this question so important? Because we are thinking about spending an enormous amount of time, effort and money on a major overhaul of the education system when we don't have the data to tell us if what we'll spend will wasted or, worse yet, if we are to some extent playing a zero sum game.
Social norming can work both ways. If you remove all of the students whose parents are willing and able to go through the application process, the norms of acceptable behavior for those left behind will move in an ugly direction and the kids who started out with the greatest disadvantages would be left to bear the burden.
But we can answer these questions and make decisions based on solid, statistically sound data. Educational reform is not like climate change where observational data is our only reasonable option. Randomized trials are an option in most cases; they are not that difficult or expensive.
Until we get good data, how can we expect to make good decisions?
* Correction: There should have been a link here to
this post by Andrew Gelman.