Uncertainty need not lead to inaction.
The fourth and final area where I expect researchers to struggle is making decisions in the face of uncertainty. That is, after all the philosophizing and balancing, research ethics involves deciding what to do and what not to do. Unfortunately, these decisions often must be made based on incomplete information. For example, when designing Encore, researchers might have wished to know the probability that it would cause someone to be visited by the police. Or, when designing Emotional Contagion, researchers might have wished to know the probability that it could trigger depression in some participants. These probabilities were likely extremely low, but they were unknown before the research took place. And, because neither project publicly tracked information about adverse events, these probabilities are still not generally known.
Uncertainties are not unique to social research in the digital age. When the Belmont Report described the systematic assessment of risks and benefits, it explicitly acknowledged these would be difficult to quantify exactly. These uncertainties, however, are more severe in the digital age, in part because we have less experience with this type of research and in part because of the characteristics of the research itself.
Given these uncertainties, some people seem to advocate for something like “better safe than sorry,” which is a colloquial version of the Precautionary Principle. While this approach appears reasonable—perhaps even wise—it can actually cause harm; it has a chilling effect on research; and it causes people to take an excessively narrow view of the situation (Sunstein 2005). In order to understand the problems with the Precautionary Principle, let’s consider Emotional Contagion. The experiment was planned to involve about 700,000 people, and there was certainly some chance that people in the experiment would suffer harm. But there was also some chance that the experiment could yield knowledge that would be beneficial to Facebook users and to society. Thus, while allowing the experiment was a risk (as has been amply discussed), preventing the experiment would also have been a risk, because it could have produced valuable knowledge. Of course, the choice was not between doing the experiment as it occurred and not doing the experiment; there were many possible modifications to the design that might have brought it into a different ethical balance. However, at some point, researchers will face the choice between doing a study and not doing it, and there are risks in both action and inaction. It is inappropriate to focus only on the risks of action. Quite simply, there is no risk-free approach.
Moving beyond the Precautionary Principle, one important way to think about making decisions given uncertainty is the minimal risk standard. This standard attempts to benchmark the risk of a particular study against the risks that participants undertake in their daily lives, such as playing sports and driving cars (Wendler et al. 2005). This approach is valuable because assessing whether something meets the minimal risk standard is easier than assessing the actual level of risk. For example, in Emotional Contagion, before the study began, the researchers could have compared the emotional content of News Feeds in the experiment with that of other News Feeds on Facebook. If they had been similar, then the researchers could have concluded that the experiment met the minimal risk standard (M. N. Meyer 2015). And they could have made this decision even though they didn’t know the absolute level of risk. The same approach could have been applied to Encore. Initially, Encore triggered requests to websites that were known to be sensitive, such as those of banned political groups in countries with repressive governments. As such, it was not minimal risk for participants in certain countries. However, the revised version of Encore—which only triggered requests to Twitter, Facebook, and YouTube—was minimal risk because requests to those sites are triggered during normal web browsing (Narayanan and Zevenbergen 2015).
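To make the benchmarking idea concrete, here is a minimal sketch in Python. All of the numbers are hypothetical stand-ins for some measure of emotional content, such as the proportion of negative words in a News Feed; the point is only to show the logic of comparing an experimental condition against the range of everyday experience.

```python
# A minimal sketch of benchmarking an experimental condition against
# participants' everyday experience. All numbers are hypothetical stand-ins
# for some measure of emotional content (e.g., proportion of negative words).
import statistics

# Hypothetical emotional content of News Feeds under the proposed experiment.
experiment_feed_negativity = [0.042, 0.047, 0.051, 0.044, 0.049]

# Hypothetical emotional content of ordinary News Feeds seen in daily use.
everyday_feed_negativity = [0.030, 0.045, 0.062, 0.051, 0.038, 0.055]

experiment_mean = statistics.mean(experiment_feed_negativity)
everyday_low = min(everyday_feed_negativity)
everyday_high = max(everyday_feed_negativity)

# If the experimental feeds fall within the range of everyday variation,
# that supports (but does not settle) a minimal risk argument.
if everyday_low <= experiment_mean <= everyday_high:
    print("Experimental condition lies within the range of everyday experience.")
else:
    print("Experimental condition falls outside everyday experience; "
          "a minimal risk argument is harder to make.")
```

A real assessment would involve far richer data and more careful measures, but even a rough version like this forces the comparison that the minimal risk standard asks for.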
A second important idea when making decisions about studies with unknown risk is power analysis, which allows researchers to calculate the sample size they will need to reliably detect an effect of a given size (Cohen 1988). If your study might expose participants to risk—even minimal risk—then the principle of Beneficence suggests that you should impose the smallest amount of risk needed to achieve your research goals. (Think back to the Reduce principle in chapter 4.) Even though some researchers have an obsession with making their studies as big as possible, research ethics suggests that researchers should make their studies as small as possible. Power analysis is not new, of course, but there is an important difference between the way that it was used in the analog age and how it should be used today. In the analog age, researchers generally did power analysis to make sure that their study was not too small (i.e., under-powered). Now, however, researchers should do power analysis to make sure that their study is not too big (i.e., over-powered).
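As a concrete illustration, here is a minimal power analysis sketch in Python using the statsmodels package, assuming a simple two-group comparison of means; the effect size, significance level, and target power are illustrative choices, not recommendations.

```python
# A minimal sketch of a power analysis for a two-group experiment, using
# statsmodels. The inputs below are illustrative, not recommendations.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Smallest effect the study needs to detect, in standard-deviation units.
effect_size = 0.1

# Sample size per group needed to detect that effect with 80% power
# at a 5% significance level (two-sided test).
n_per_group = analysis.solve_power(
    effect_size=effect_size, alpha=0.05, power=0.80, alternative="two-sided"
)

print(f"Participants needed per group: {round(n_per_group)}")
```

Used this way, the calculation provides a ceiling as well as a floor: enrolling many times more participants than the analysis requires exposes additional people to risk without a corresponding gain in what the study can learn.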
The minimal risk standard and power analysis help you reason about and design studies, but they don’t provide you with any new information about how participants might feel about your study and what risks they might experience from participating in it. Another way to deal with uncertainty is to collect additional information, which leads to ethical-response surveys and staged trials.
In ethical-response surveys, researchers present respondents with a brief description of a proposed research project and then ask two questions about it.
Following each question, respondents are provided a space in which they can explain their answer. Finally, respondents—who could be potential participants or people recruited from a microtask labor market (e.g., Amazon Mechanical Turk)—answer some basic demographic questions (Schechter and Bravo-Lillo 2014).
Ethical-response surveys have three features that I find particularly attractive. First, they happen before a study has been conducted, and therefore they can prevent problems before the research starts (as opposed to approaches that monitor for adverse reactions). Second, the respondents in ethical-response surveys are typically not researchers, which helps researchers see their study from the perspective of the public. Finally, ethical-response surveys enable researchers to pose multiple versions of a research project in order to assess the perceived ethical balance of different versions of the same project. One limitation of ethical-response surveys, however, is that it is not clear how to decide between different research designs given the survey results. Despite this limitation, ethical-response surveys appear to be helpful; in fact, Schechter and Bravo-Lillo (2014) report abandoning a planned study in response to concerns raised by participants in an ethical-response survey.
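To illustrate the third feature, here is a minimal sketch in Python that tabulates responses to two hypothetical versions of a proposed study; the response categories and answers are invented for illustration and do not come from Schechter and Bravo-Lillo (2014).

```python
# A minimal sketch of comparing ethical-response survey results for two
# hypothetical versions of a proposed study. All responses are invented.
from collections import Counter

responses = {
    "version A (original design)": [
        "proceed", "proceed with caution", "do not proceed",
        "proceed with caution", "do not proceed",
    ],
    "version B (reduced design)": [
        "proceed", "proceed", "proceed with caution",
        "proceed", "proceed with caution",
    ],
}

for version, answers in responses.items():
    counts = Counter(answers)
    share_against = counts["do not proceed"] / len(answers)
    print(f"{version}: {dict(counts)} "
          f"({share_against:.0%} would not allow the study)")
```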
While ethical-response surveys can be helpful for assessing reactions to proposed research, they cannot measure the probability or severity of adverse events. One way that medical researchers deal with uncertainty in high-risk settings is to perform staged trials—an approach that might be helpful in some social research. When testing the effectiveness of a new drug, researchers do not immediately jump to a large randomized clinical trial. Rather, they run two types of studies first. Initially, in a phase I trial, researchers are particularly focused on finding a safe dose, and these studies involve a small number of people. Once a safe dose has been determined, phase II trials assess the efficacy of the drug, that is, its ability to work in a best-case situation (Singal, Higgins, and Waljee 2014). Only after phase I and II studies have been completed is a new drug allowed to be assessed in a large randomized controlled trial. While the exact structure of staged trials used in the development of new drugs may not be a good fit for social research, when faced with uncertainty, researchers could run smaller studies explicitly focused on safety and efficacy. For example, with Encore, you could imagine the researchers starting with participants in countries with a strong rule of law.
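Here is a minimal sketch in Python of what a staged approach might look like for a social research study; the functions, sample sizes, and threshold are hypothetical placeholders, and the simulated adverse-event rate exists only to make the example runnable.

```python
# A minimal sketch of a staged rollout: run a small stage focused on safety,
# and scale up only if the observed adverse-event rate stays below a
# pre-registered threshold. Everything here is a hypothetical placeholder.
import random

SAFETY_STAGE_SIZE = 100     # small first stage, focused on safety
FULL_STUDY_SIZE = 10_000    # full study, run only if the first stage looks safe
MAX_ADVERSE_RATE = 0.01     # pre-registered stopping threshold

def run_stage(n_participants):
    # Placeholder: a real study would enroll participants and record whether
    # each one experienced an adverse event. Here we simulate a low rate.
    return [random.random() < 0.005 for _ in range(n_participants)]

def adverse_event_rate(outcomes):
    # Share of participants in a stage who experienced an adverse event.
    return sum(outcomes) / len(outcomes)

pilot_outcomes = run_stage(SAFETY_STAGE_SIZE)
if adverse_event_rate(pilot_outcomes) <= MAX_ADVERSE_RATE:
    full_outcomes = run_stage(FULL_STUDY_SIZE)
    print("Safety stage passed; full study completed.")
else:
    print("Adverse events in the safety stage exceeded the threshold; "
          "revise the design before scaling up.")
```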
Together, these four approaches—the minimal risk standard, power analysis, ethical-response surveys, and staged trials—can help you proceed in a sensible way, even in the face of uncertainty. Uncertainty need not lead to inaction.