The key to running large experiments is driving your variable cost to zero. The best ways to do this are automation and designing enjoyable experiments.
Digital experiments can have dramatically different cost structures from analog experiments, and this enables researchers to run experiments that were impossible in the past. More specifically, experiments generally have two main types of costs: fixed costs and variable costs. Fixed costs are costs that don’t change depending on how many participants you have. For example, in a lab experiment, fixed costs might be the cost of renting the space and buying furniture. Variable costs, on the other hand, change depending on how many participants you have. For example, in a lab experiment, variable costs might come from paying staff and participants. In general, analog experiments have low fixed costs and high variable costs, and digital experiments have high fixed costs and low variable costs (Figure 4.18). With appropriate design, you can drive the variable cost of your experiment all the way to zero, and this can create exciting research opportunities.
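One simple way to express this difference—using notation that is mine, not anything formal from these studies—is to write the total cost of an experiment with fixed cost $F$, variable cost per participant $v$, and $n$ participants as

$$
C(n) = F + v \cdot n.
$$

Analog experiments typically have small $F$ and large $v$, so their total cost climbs with every additional participant; digital experiments typically have large $F$ and $v$ close to zero, so once the experiment is built, the 10,000th participant costs roughly the same as the 10th.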
There are two main elements of variable cost—payments to staff and payments to participants—and each of these can be driven to zero using different strategies. Payments to staff stem from the work that research assistants do recruiting participants, delivering treatments, and measuring outcomes. For example, the analog field experiment of Schultz and colleagues (2007) on social norms and electricity usage required research assistants to travel to each home to deliver the treatment and read the electric meter (Figure 4.3). All of this effort by research assistants meant that adding a new household to the study would have added to the cost. On the other hand, for the digital field experiment of Restivo and van de Rijt (2012) on rewards in Wikipedia, researchers could add more participants at virtually no cost. A general strategy for reducing variable administrative costs is to replace human work (which is expensive) with computer work (which is cheap). Roughly, you can ask yourself: can this experiment run while everyone on my research team is sleeping? If the answer is yes, you’ve done a great job of automation.
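To make that question concrete, here is a minimal sketch of what “running while everyone is sleeping” looks like in code. Nothing in it corresponds to any particular study; the function names, the SQLite database, and the two-condition design are all hypothetical. The point is only that enrollment, treatment assignment, and outcome measurement are handled by a program rather than by research assistants.

```python
# A minimal, hypothetical sketch of a fully automated experiment: participants
# enroll, are randomly assigned to a condition, and have their outcomes logged,
# all without anyone on the research team doing per-participant work.
import random
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect("experiment.db")
conn.execute("CREATE TABLE IF NOT EXISTS assignments"
             " (id TEXT PRIMARY KEY, condition TEXT, enrolled_at TEXT)")
conn.execute("CREATE TABLE IF NOT EXISTS outcomes"
             " (id TEXT, outcome REAL, measured_at TEXT)")

def enroll(participant_id: str) -> str:
    """Randomly assign a new participant to a condition and record the assignment."""
    condition = random.choice(["control", "treatment"])
    conn.execute("INSERT INTO assignments VALUES (?, ?, ?)",
                 (participant_id, condition, datetime.now(timezone.utc).isoformat()))
    conn.commit()
    return condition

def record_outcome(participant_id: str, outcome: float) -> None:
    """Log an outcome measurement; no staff time is spent per participant."""
    conn.execute("INSERT INTO outcomes VALUES (?, ?, ?)",
                 (participant_id, outcome, datetime.now(timezone.utc).isoformat()))
    conn.commit()
```

In a design like this, adding the next participant adds a row to a database rather than an hour of staff time, which is exactly what drives variable administrative costs toward zero.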
The second main type of variable cost is payments to participants. Some researchers have used Amazon Mechanical Turk and other online labor markets to decrease the payments that are needed for participants. To drive variable costs all the way to zero, however, a different approach is needed. For a long time, researchers have designed experiments that are so boring that they have to pay people to participate. But, what if you could create an experiment that people want to be in? This may sound far-fetched, but I’ll give you an example below from my own work, and there are more examples in Table 4.4. Note that this approach to designing enjoyable experiments echoes some of the themes in Chapter 3 regarding designing more enjoyable surveys and in Chapter 5 regarding the design of mass collaboration. Thus, I think that participant enjoyment—what might also be called user experience—will be an increasingly important part of research design in the digital age.
Table 4.4: Examples of experiments that compensated participants with something other than money.

Compensation | Citation
---|---
Website with health information | Centola (2010)
Exercise program | Centola (2011)
Free music | Salganik, Dodds, and Watts (2006); Salganik and Watts (2008); Salganik and Watts (2009b)
Fun game | Kohli et al. (2012)
Movie recommendations | Harper and Konstan (2015)
If you want to create zero variable cost experiments, you’ll want to ensure that everything is fully automated and that participants don’t require any payments. In order to show how this is possible, I’ll describe my dissertation research on the success and failure of cultural products. This example also shows that zero variable cost data is not just about doing things cheaper. Rather, it is about enabling experiments that would not be possible otherwise.
My dissertation was motivated by the puzzling nature of success for cultural products. Hit songs, best-selling books, and blockbuster movies are much, much more successful than average. Because of this, the markets for these products are often called “winner-take-all” markets. Yet, at the same time, which particular song, book, or movie will become successful is incredibly unpredictable. The screenwriter William Goldman (1989) elegantly summed up lots of academic research by saying that, when it comes to predicting success, “nobody knows anything.” The unpredictability of winner-take-all markets made me wonder how much of success is a result of quality and how much is just luck. Or, expressed slightly differently, if we could create parallel worlds and have them all evolve independently, would the same songs become popular in each world? And, if not, what might be a mechanism that causes these differences?
In order to answer these questions, we—Peter Dodds, Duncan Watts (my dissertation advisor), and I—ran a series of online field experiments. In particular, we built a website called MusicLab where people could discover new music, and we used it for a series of experiments. We recruited participants by running banner ads on a teen-interest website (Figure 4.19) and through mentions in the media. Participants arriving at our website provided informed consent, completed a short background questionnaire, and were randomly assigned to one of two experimental conditions—independent and social influence. In the independent condition, participants made decisions about which songs to listen to, given only the names of the bands and the songs. While listening to a song, participants were asked to rate it, after which they had the opportunity (but not the obligation) to download the song. In the social influence condition, participants had the same experience, except they could also see how many times each song had been downloaded by previous participants. Furthermore, participants in the social influence condition were randomly assigned to one of eight parallel worlds, each of which evolved independently (Figure 4.20). Using this design, we ran two related experiments. In the first, we presented the songs to participants in an unsorted grid, which provided them with a weak signal of popularity. In the second experiment, we presented the songs in a ranked list, which provided a much stronger signal of popularity (Figure 4.21).
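The assignment logic in this kind of design is simple enough to sketch. The code below is my illustration, not the actual MusicLab implementation, and the allocation probability is made up; it only shows the two-stage randomization described above: first into the independent or social influence condition, and then, for social influence participants, into one of eight parallel worlds.

```python
import random

NUM_WORLDS = 8         # parallel worlds within the social influence condition
P_INDEPENDENT = 0.5    # illustrative allocation probability, not the one actually used

def assign(participant_id: str) -> dict:
    """Two-stage random assignment: condition first, then (if relevant) a world."""
    if random.random() < P_INDEPENDENT:
        # Independent condition: choices based only on band and song names.
        return {"id": participant_id, "condition": "independent", "world": None}
    # Social influence condition: the participant also sees download counts,
    # but only the counts accumulated in their own randomly assigned world.
    return {"id": participant_id,
            "condition": "social_influence",
            "world": random.randrange(NUM_WORLDS)}
```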
We found that the popularity of the songs differed across the worlds, suggesting an important role for luck. For example, in one world the song “Lockdown” by 52Metro came in 1st, and in another world it came in 40th out of 48 songs. This was exactly the same song competing against all the same songs, but in one world it got lucky and in the others it did not. Further, by comparing results across the two experiments, we found that social influence leads to more unequal success, which perhaps creates the appearance of predictability. But, looking across the worlds (which can’t be done outside of this kind of parallel worlds experiment), we found that social influence actually increased the unpredictability of success. Further, surprisingly, it was the songs of highest appeal that had the most unpredictable outcomes (Figure 4.22).
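To give a sense of how results like these are computed, here is a sketch of two measures that are natural for parallel-worlds data: the Gini coefficient of download shares within a world (inequality of success) and the average absolute difference in a song’s market share across pairs of worlds (unpredictability). The code and the toy data are my illustration, not a reproduction of the published analysis.

```python
import itertools
import numpy as np

def gini(shares: np.ndarray) -> float:
    """Gini coefficient of market shares (0 = perfectly equal, 1 = winner-take-all)."""
    x = np.sort(shares)
    n = len(x)
    cum = np.cumsum(x)
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

def unpredictability(worlds: np.ndarray) -> float:
    """Average absolute difference in each song's market share across pairs of worlds.

    `worlds` has shape (num_worlds, num_songs), with each row summing to 1.
    """
    pairs = itertools.combinations(worlds, 2)
    return float(np.mean([np.abs(a - b).mean() for a, b in pairs]))

# Toy data: 8 worlds, 48 songs, randomly generated download counts (illustrative only).
rng = np.random.default_rng(0)
downloads = rng.poisson(lam=20, size=(8, 48)).astype(float)
shares = downloads / downloads.sum(axis=1, keepdims=True)

print("mean within-world Gini:", np.mean([gini(w) for w in shares]))
print("unpredictability across worlds:", unpredictability(shares))
```

Comparing quantities like these between the independent and social influence conditions, and between the grid and ranked-list experiments, is what supports claims that social influence made success both more unequal and less predictable.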
MusicLab was able to run at essentially zero variable cost because of the way that it was designed. First, everything was fully automated, so it was able to run while I was sleeping. Second, the compensation was free music, so there was no variable participant compensation cost. The use of music as compensation also illustrates how there is sometimes a trade-off between fixed costs and variable costs. Using music increased the fixed costs because I had to spend time securing permission from the bands and preparing reports for the bands about participants’ reactions to their music. But, in this case, increasing fixed costs in order to decrease variable costs was the right thing to do; that’s what enabled us to run an experiment that was about 100 times larger than a standard lab experiment.
Further, the MusicLab experiments show that zero variable cost does not have to be an end in itself; rather, it can be a means to running a new kind of experiment. Notice that we did not use all of our participants to run a standard social influence lab experiment 100 times. Instead, we did something different, which you could think of as switching from a psychological experiment to a sociological experiment (Hedström 2006). Rather than focusing on individual decision-making, we focused our experiment on popularity, a collective outcome. This switch to a collective outcome meant that we required about 700 participants to produce a single data point (there were 700 people in each of the parallel worlds). That scale was only possible because of the cost structure of the experiment. In general, if researchers want to study how collective outcomes arise from individual decisions, group experiments such as MusicLab are very exciting. In the past, they have been logistically difficult, but those difficulties are fading because of the possibility of zero variable cost data.
In addition to illustrating the benefits of zero variable cost data, the MusicLab experiments also show a challenge with this approach: high fixed costs. In my case, I was extremely lucky to be able to work with a talented web developer named Peter Hausel for about six months to construct the experiment. This was only possible because my advisor, Duncan Watts, had received a number of grants to support this kind of research. Technology has improved since we built MusicLab in 2004, and it would be much easier to build an experiment like this now. But, high fixed cost strategies are really only possible for researchers who can somehow cover those costs.
In conclusion, digital experiments can have dramatically different cost structures than analog experiments. If you want to run really large experiments, you should try to decrease your variable cost as much as possible, ideally all the way to zero. You can do this by automating the mechanics of your experiment (e.g., replacing human time with computer time) and by designing experiments that people want to be in. Researchers who can design experiments with these features will be able to run new kinds of experiments that were not possible in the past.