Researchers who study dolphins can’t ask them questions and are therefore forced to try to learn about dolphins by observing their behavior. Researchers who study humans, on the other hand, have it easier: their respondents can talk. Talking to people was an important part of social research in the past, and I expect that it will be in the future too.
In social research, talking to people typically takes two forms: surveys and in-depth interviews. Roughly speaking, research using surveys involves systematic recruitment of large numbers of participants, highly structured questionnaires, and the use of statistical methods to generalize from the participants to a larger population. Research using in-depth interviews, on the other hand, generally involves a small number of participants and semi-structured conversations, and it results in a rich, qualitative description of the participants. Surveys and in-depth interviews are both powerful approaches, but surveys are much more affected by the transition from the analog to the digital age. Therefore, in this chapter, I’ll focus on survey research.
As I’ll show in this chapter, the digital age creates many exciting opportunities for survey researchers to collect data more quickly and cheaply, to ask different kinds of questions, and to magnify the value of survey data with big data sources. The idea that survey research can be transformed by a technological change is not new, however. Around 1970, a similar change was taking place, driven by a different communication technology: the telephone. Fortunately, understanding how the telephone changed survey research can help us imagine how the digital age will change survey research.
Survey research, as we recognize it today, began in the 1930s. During the first era of survey research, researchers would randomly sample geographic areas (such as city blocks) and then travel to those areas in order to have face-to-face conversations with people in randomly sampled households. Then, a technological development—the widespread diffusion of landline phones in wealthy countries—eventually led to the second era of survey research. This second era differed both in how people were sampled and in how conversations took place. In the second era, rather than sampling households in geographic areas, researchers randomly sampled telephone numbers in a procedure called random-digit dialing. And rather than traveling to talk to people face to face, researchers instead called them on the telephone. These might seem like small logistical changes, but they made survey research faster, cheaper, and more flexible. In addition to being empowering, these changes were also controversial because many researchers were concerned that these new sampling and interviewing procedures could introduce a variety of biases. But eventually, after lots of work, researchers figured out how to collect data reliably using random-digit dialing and telephone interviews. Thus, by figuring out how to successfully harness society’s technological infrastructure, researchers were able to modernize how they did survey research.
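To make the sampling step concrete, here is a minimal Python sketch of the basic idea behind random-digit dialing: attach random four-digit suffixes to a list of area-code-and-exchange prefixes, so that every possible number under those prefixes is equally likely to be drawn. The function name `rdd_sample`, the example prefixes, and the seed are all hypothetical, and the sketch leaves out the refinements that practical designs required (for instance, it makes no attempt to avoid nonworking or business numbers).

```python
import random


def rdd_sample(prefixes, n, seed=None):
    """Draw n distinct phone numbers by pairing a randomly chosen prefix
    with a random four-digit suffix (simple random-digit dialing).

    prefixes: list of distinct strings such as "609-555" (hypothetical values)
    n:        number of distinct telephone numbers to draw
    seed:     optional seed so that a draw can be reproduced
    """
    # Each prefix covers 10,000 possible suffixes (0000 through 9999).
    max_numbers = len(prefixes) * 10_000
    if n > max_numbers:
        raise ValueError("n exceeds the number of possible phone numbers")

    rng = random.Random(seed)
    drawn = set()
    while len(drawn) < n:
        prefix = rng.choice(prefixes)      # uniform over prefixes
        suffix = rng.randrange(10_000)     # uniform over 0..9999
        drawn.add(f"{prefix}-{suffix:04d}")  # duplicates are redrawn
    return sorted(drawn)


if __name__ == "__main__":
    # Hypothetical prefixes, for illustration only.
    for number in rdd_sample(["609-555", "212-555"], n=5, seed=1):
        print(number)
```

Because both the prefix and the suffix are chosen uniformly, unlisted numbers are just as likely to be drawn as listed ones, which is the main attraction of sampling numbers rather than names from a directory.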
Now, another technological development—the digital age—will eventually bring us to a third era of survey research. This transition is being driven in part by the gradual decay of second-era approaches (B. D. Meyer, Mok, and Sullivan 2015). For example, for a variety of technological and social reasons, nonresponse rates—that is, the proportion of sampled people that do not participate in surveys—have been increasing for many years (National Research Council 2013). These long-term trends mean that the nonresponse rate can now exceed 90% in standard telephone surveys (Kohut et al. 2012).
On the other hand, the transition to a third era is also being driven in part by exciting new opportunities, some of which I’ll describe in this chapter. Although things are not yet settled, I expect that the third era of survey research will be characterized by non-probability sampling, computer-administered interviews, and the linkage of surveys to big data sources (table 3.1).
Era | Sampling | Interviewing | Data environment |
---|---|---|---|
First era | Area probability sampling | Face-to-face | Stand-alone surveys |
Second era | Random-digit dialing (RDD) probability sampling | Telephone | Stand-alone surveys |
Third era | Non-probability sampling | Computer-administered | Surveys linked to big data sources |
The transition between the second and third eras of survey research has not been completely smooth, and there have been fierce debates about how researchers should proceed. Looking back on the transition between the first and second eras, I think there is one key insight for us now: the beginning is not the end. That is, initially many second-era telephone-based methods were ad hoc and did not work very well. But, through hard work, researchers solved these problems. For example, researchers had been doing random-digit dialing for many years before Warren Mitofsky and Joseph Waksberg developed a random-digit dialing sampling method that had good practical and theoretical properties (Waksberg 1978; ???). Thus, we should not confuse the current state of third-era approaches with their ultimate outcomes.
The history of survey research shows that the field evolves, driven by changes in technology and society. There is no way to stop that evolution. Rather, we should embrace it, while continuing to draw wisdom from earlier eras, and that is the approach that I will take in this chapter. First, I will argue that big data sources will not replace surveys and that the abundance of big data sources increases—not decreases—the value of surveys (section 3.2). Given that motivation, I’ll summarize the total survey error framework (section 3.3) that was developed during the first two eras of survey research. This framework enables us to understand new approaches to representation—in particular, non-probability samples (section 3.4)—and new approaches to measurement—in particular, new ways of asking questions to respondents (section 3.5). Finally, I’ll describe two research templates for linking survey data to big data sources (section 3.6).