In the approaches covered so far in this book—observing behavior (chapter 2) and asking questions (chapter 3)—researchers collect data without intentionally and systematically changing the world. The approach covered in this chapter—running experiments—is fundamentally different. When researchers run experiments, they systematically intervene in the world to create data that is ideally suited to answering questions about cause-and-effect relationships.
Cause-and-effect questions are very common in social research, and examples include questions such as: Does increasing teacher salaries increase student learning? What is the effect of minimum wage on employment rates? How does a job applicant’s race affect her chance of getting a job? In addition to these explicitly causal questions, sometimes cause-and-effect questions are implicit in more general questions about maximization of some performance metric. For example, the question “What color should the donate button be on an NGO’s website?” is really lots of questions about the effect of different button colors on donations.
One way to answer cause-and-effect questions is to look for patterns in existing data. For example, returning to the question about the effect of teacher salaries on student learning, you might calculate that students learn more in schools that offer high teacher salaries. But, does this correlation show that higher salaries cause students to learn more? Of course not. Schools where teachers earn more might be different in many ways. For example, students in schools with high teacher salaries might come from wealthier families. Thus, what looks like an effect of teachers could just come from comparing different types of students. These unmeasured differences between students are called confounders, and, in general, the possibility of confounders wreaks havoc on researchers’ ability to answer cause-and-effect questions by looking for patterns in existing data.
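To see how a confounder can manufacture a pattern like this, consider a minimal simulation. Everything in it—the variable names, the effect sizes, the noise levels—is a purely illustrative assumption, not an estimate from any real study. The key feature is that teacher salary has no causal effect at all, yet the naive correlation with student learning is clearly positive because family wealth drives both.

```python
import numpy as np

rng = np.random.default_rng(0)
n_schools = 1_000

# Unmeasured confounder: family wealth in each school's district.
family_wealth = rng.normal(size=n_schools)

# Wealthier districts pay teachers more (by assumption).
teacher_salary = 50 + 5 * family_wealth + 5 * rng.normal(size=n_schools)

# Student learning depends on family wealth, not on teacher salary:
# the true causal effect of salary is set to zero here.
student_learning = 100 + 5 * family_wealth + 5 * rng.normal(size=n_schools)

# The naive correlation between salary and learning is still clearly positive.
print(np.corrcoef(teacher_salary, student_learning)[0, 1])  # about 0.5 in this setup
```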
One solution to the problem of confounders is to try to make fair comparisons by adjusting for observable differences between groups. For example, you might be able to download property tax data from a number of government websites. Then, you could compare student performance in schools where home prices are similar but teacher salaries are different, and you still might find that students learn more in schools with higher teacher pay. But there are still many possible confounders. Maybe the parents of these students differ in their level of education. Or maybe the schools differ in their proximity to public libraries. Or maybe the schools with higher teacher pay also have higher pay for principals, and principal pay, not teacher pay, is really what is increasing student learning. You could try to measure and adjust for these factors as well, but the list of possible confounders is essentially endless. In many situations, you just cannot measure and adjust for all the possible confounders. In response to this challenge, researchers have developed a number of techniques for making causal estimates from non-experimental data—I discussed some of them in chapter 2—but, for certain kinds of questions, these techniques are limited, and experiments offer a promising alternative.
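The following sketch illustrates why adjusting for an observable confounder does not remove the bias from an unmeasured one. Again, the setup is hypothetical: home prices are observed and adjusted for, parental education is not, and the true effect of teacher salary is set to zero. Even after the adjustment, the estimated salary effect remains clearly positive.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

home_prices = rng.normal(size=n)        # observed confounder (adjusted for)
parent_education = rng.normal(size=n)   # unobserved confounder

# Both factors push teacher salaries up (by assumption).
teacher_salary = 2 * home_prices + 2 * parent_education + rng.normal(size=n)

# Learning depends on both factors but not on salary (true effect = 0).
learning = 3 * home_prices + 3 * parent_education + rng.normal(size=n)

# "Adjust" for home prices by including them in a linear regression.
X = np.column_stack([np.ones(n), teacher_salary, home_prices])
coef, *_ = np.linalg.lstsq(X, learning, rcond=None)

# The salary coefficient is still clearly positive (about 1.2 here),
# because the unmeasured confounder biases the adjusted estimate.
print(coef[1])
```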
Experiments enable researchers to move beyond the correlations in naturally occurring data in order to reliably answer certain cause-and-effect questions. In the analog age, experiments were often logistically difficult and expensive. Now, in the digital age, logistical constraints are gradually fading away. Not only is it easier to do experiments like those done in the past, it is now possible to run new kinds of experiments.
In what I’ve written so far I’ve been a bit loose in my language, but it is important to distinguish between two things: experiments and randomized controlled experiments. In an experiment, a researcher intervenes in the world and then measures an outcome. I’ve heard this approach described as “perturb and observe.” In a randomized controlled experiment, a researcher intervenes for some people and not for others, and the researcher decides which people receive the intervention by randomization (e.g., flipping a coin). Randomized controlled experiments create fair comparisons between two groups: one that has received the intervention and one that has not. In other words, randomized controlled experiments are a solution to the problem of confounders. Perturb-and-observe experiments, however, involve only a single group that has received the intervention, and therefore the results can lead researchers to the wrong conclusion (as I’ll show soon). Despite the important differences between experiments and randomized controlled experiments, social researchers often use these terms interchangeably. I’ll follow this convention, but, at certain points, I’ll break the convention to emphasize the value of randomized controlled experiments over experiments without randomization and a control group.
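A small simulation can make the contrast concrete. In this hypothetical setup, a perturb-and-observe design (treat everyone, then compare outcomes before and after) is thrown off by an unrelated drift in outcomes over time, while random assignment balances the groups and recovers the true effect. The effect sizes and the size of the drift are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
true_effect = 1.0                # the effect we would like to recover
baseline = rng.normal(size=n)    # individual differences (potential confounders)

# Perturb and observe: treat everyone, compare before vs. after.
# Outcomes also drift upward by 0.5 for reasons unrelated to the treatment,
# so the before/after difference overstates the effect.
before = baseline + rng.normal(size=n)
after = baseline + true_effect + 0.5 + rng.normal(size=n)
print(round(float((after - before).mean()), 2))   # about 1.5, not 1.0

# Randomized controlled experiment: a coin flip decides who is treated.
treated = rng.integers(0, 2, size=n).astype(bool)
outcome = baseline + true_effect * treated + rng.normal(size=n)

# Random assignment balances baseline differences across the two groups,
# so a simple difference in means recovers the true effect.
print(round(float(outcome[treated].mean() - outcome[~treated].mean()), 2))   # about 1.0
```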
Randomized controlled experiments have proven to be a powerful way to learn about the social world, and in this chapter, I’ll show you more about how to use them in your research. In section 4.2, I’ll illustrate the basic logic of experimentation with an example of an experiment on Wikipedia. Then, in section 4.3, I’ll describe the difference between lab experiments and field experiments and the difference between analog experiments and digital experiments. Further, I’ll argue that digital field experiments can offer the best features of analog lab experiments (tight control) and analog field experiments (realism), all at a scale that was not possible previously. Next, in section 4.4, I’ll describe three concepts—validity, heterogeneity of treatment effects, and mechanisms—that are critical for designing rich experiments. With that background, in section 4.5, I’ll describe the trade-offs involved in the two main strategies for conducting digital experiments: doing it yourself or partnering with the powerful. Finally, I’ll conclude with some design advice about how you can take advantage of the real power of digital experiments (section 4.6.1) and describe some of the responsibility that comes with that power (section 4.6.2).