Given these ten characteristics of big data sources and the inherent limitations of even perfectly observed data, what kind of research strategies are useful? That is, how can we learn when we don’t ask questions and don’t run experiments? It might seem that just watching people could not lead to interesting research, but that’s not the case.
I see three main strategies for learning from observational data: counting things, forecasting things, and approximating experiments. I’ll describe each of these approaches, which could be called “research strategies” or “research recipes,” and I’ll illustrate them with examples. These strategies are neither mutually exclusive nor exhaustive, but they do capture a lot of research with observational data.
To foreshadow the claims that follow, counting things is most important when we are empirically adjudicating between predictions from different theories. Forecasting, and especially nowcasting, can be useful for policy makers. Finally, big data increases our ability to make causal estimates from observational data.