In the chapter, I was very positive about post-stratification. However, it does not always improve the quality of estimates. Construct a situation where post-stratification can decrease the quality of estimates. (For a hint, see Thomsen (1973).)
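One way to start exploring this (a minimal simulation sketch, with every number chosen purely for illustration): when the post-stratification variable is unrelated to the outcome and the sample is small, the post-stratified estimator can have a higher mean-squared error than the simple sample mean, because the noisy cell means inflate the variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Population: 10 strata of equal size; the outcome is unrelated to the strata.
n_strata, pop_per_stratum = 10, 10_000
strata = np.repeat(np.arange(n_strata), pop_per_stratum)
y = rng.normal(0, 1, size=strata.size)
true_mean = y.mean()
stratum_share = np.full(n_strata, 1 / n_strata)   # known population shares

def one_draw(n=30):
    """Return (simple mean, post-stratified mean) for one small random sample."""
    idx = rng.choice(y.size, size=n, replace=False)
    ys, ss = y[idx], strata[idx]
    simple = ys.mean()
    # Post-stratify: weight each stratum's sample mean by its population share,
    # falling back to the overall sample mean for strata with no sampled units.
    cell_means = np.array([ys[ss == h].mean() if (ss == h).any() else simple
                           for h in range(n_strata)])
    return simple, cell_means @ stratum_share

draws = np.array([one_draw() for _ in range(5_000)])
mse = ((draws - true_mean) ** 2).mean(axis=0)
print(f"MSE of simple mean:          {mse[0]:.4f}")
print(f"MSE of post-stratified mean: {mse[1]:.4f}")
```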
Design and conduct a non-probability survey on Amazon Mechanical Turk to ask about gun ownership and attitudes toward gun control. So that you can compare your estimates to those derived from a probability sample, please copy the question text and response options directly from a high-quality survey such as those run by the Pew Research Center.
Goel and colleagues (2016) administered 49 multiple-choice attitudinal questions drawn from the General Social Survey (GSS) and select surveys by the Pew Research Center to a non-probability sample of respondents drawn from Amazon Mechanical Turk. They then adjusted for the non-representativeness of the data using model-based post-stratification and compared their adjusted estimates with those from the probability-based GSS and Pew surveys. Conduct the same survey on Amazon Mechanical Turk and try to replicate figures 2a and 2b by comparing your adjusted estimates with the estimates from the most recent rounds of the GSS and Pew surveys. (See appendix table A2 for the list of 49 questions.)
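If you attempt the replication, the adjustment step could look roughly like the following sketch of model-based post-stratification; the variables, cells, and census counts below are hypothetical placeholders, not Goel et al.'s actual specification. The idea is to fit a model predicting each answer from demographics on the MTurk sample, predict within every demographic cell, and then average the cell predictions weighted by their population shares.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical inputs:
#   mturk:  one row per respondent with demographics and a 0/1 answer to one item
#   census: one row per demographic cell with its population count
mturk = pd.DataFrame({
    "age_group": ["18-29", "30-49", "50+", "18-29", "30-49", "50+"] * 50,
    "sex":       ["f", "m"] * 150,
    "answer":    [1, 0, 1, 0, 1, 0] * 50,
})
census = pd.DataFrame({
    "age_group": ["18-29", "18-29", "30-49", "30-49", "50+", "50+"],
    "sex":       ["f", "m", "f", "m", "f", "m"],
    "count":     [20, 21, 40, 39, 45, 50],   # made-up population counts (millions)
})

# Model the answer as a function of demographics on the non-probability sample.
X = pd.get_dummies(mturk[["age_group", "sex"]])
model = LogisticRegression().fit(X, mturk["answer"])

# Predict the probability of a "yes" answer in every post-stratification cell,
# then weight the cell predictions by their population shares.
X_cells = pd.get_dummies(census[["age_group", "sex"]]).reindex(columns=X.columns, fill_value=0)
census["p_yes"] = model.predict_proba(X_cells)[:, 1]
adjusted = (census["p_yes"] * census["count"]).sum() / census["count"].sum()
print(f"Unadjusted MTurk estimate: {mturk['answer'].mean():.3f}")
print(f"Post-stratified estimate:  {adjusted:.3f}")
```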
Many studies use self-reported measures of mobile phone use. This is an interesting setting in which researchers can compare self-reported behavior with logged behavior (see, e.g., Boase and Ling (2013)). Two common behaviors to ask about are calling and texting, and two common time frames are “yesterday” and “in the past week.”
Schuman and Presser (1996) argue that question order can matter for two types of questions: part-part questions, where two questions are at the same level of specificity (e.g., ratings of two presidential candidates); and part-whole questions, where a general question follows a more specific question (e.g., asking “How satisfied are you with your work?” followed by “How satisfied are you with your life?”).
They further characterize two types of question order effects: consistency effects occur when responses to a later question are brought closer (than they would otherwise be) to those given to an earlier question; contrast effects occur when there are greater differences (than there would otherwise be) between responses to the two questions.
Building on the work of Schuman and Presser, Moore (2002) describes a separate dimension of question order effects: additive and subtractive effects. While contrast and consistency effects are produced as a consequence of respondents’ evaluations of the two items in relation to each other, additive and subtractive effects are produced when respondents are made more sensitive to the larger framework within which the questions are posed. Read Moore (2002), then design and run a survey experiment on MTurk to demonstrate additive or subtractive effects.
Christopher Antoun and colleagues (2015) conducted a study comparing the convenience samples obtained from four different online recruiting sources: MTurk, Craigslist, Google AdWords, and Facebook. Design a simple survey and recruit participants through at least two different online recruiting sources (these sources can be different from the four sources used in Antoun et al. (2015)).
In an effort to predict the results of the 2016 EU Referendum (i.e., Brexit), YouGov—an Internet-based market research firm—conducted online polls of a panel of about 800,000 respondents in the United Kingdom.
A detailed description of YouGov’s statistical model can be found at https://yougov.co.uk/news/2016/06/21/yougov-referendum-model/. Roughly speaking, YouGov partitioned voters into types based on 2015 general election vote choice, age, qualifications, gender, and date of interview, as well as the constituency in which they lived. First, they used data collected from YouGov panelists to estimate, among those who voted, the proportion of people of each voter type who intended to vote Leave. Then, they estimated the turnout of each voter type using the 2015 British Election Study (BES), a post-election face-to-face survey that validates turnout against the electoral rolls. Finally, they estimated how many people of each voter type there were in the electorate, based on the latest census and the Annual Population Survey (with some additional information from other data sources). (The code sketch after this activity illustrates how such type-level estimates can be combined.)
Three days before the vote, YouGov showed a two-point lead for Leave. On the eve of voting, the poll indicated that the result was too close to call (49/51 Remain). The final on-the-day study predicted 48/52 in favor of Remain (https://yougov.co.uk/news/2016/06/23/yougov-day-poll/). In fact, this estimate missed the final result (52/48 Leave) by four percentage points.
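A minimal sketch of the arithmetic behind such a type-based projection, using entirely made-up numbers and only three voter types for illustration: for each type, combine the estimated Leave share among voters, the estimated turnout, and the estimated number of people of that type in the electorate.

```python
import pandas as pd

# Hypothetical voter types with made-up inputs (not YouGov's actual figures):
#   n_electorate: estimated number of people of this type (from census/APS data)
#   turnout:      estimated probability of voting (from a post-election survey)
#   p_leave:      estimated share of voters of this type intending to vote Leave
types = pd.DataFrame({
    "voter_type":   ["A", "B", "C"],
    "n_electorate": [12_000_000, 18_000_000, 16_000_000],
    "turnout":      [0.55, 0.70, 0.80],
    "p_leave":      [0.40, 0.50, 0.60],
})

# Expected number of voters of each type, and of Leave votes among them.
types["n_voters"] = types["n_electorate"] * types["turnout"]
types["n_leave"] = types["n_voters"] * types["p_leave"]

leave_share = types["n_leave"].sum() / types["n_voters"].sum()
print(f"Projected Leave share: {leave_share:.1%}")
```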
Write a simulation to illustrate each of the representation errors in figure 3.2.
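As a starting point, here is a minimal sketch that assumes the representation errors in figure 3.2 are coverage error, sampling error, and nonresponse error; all of the distributions and rates below are made up. It simulates a target population, drops part of it from the sampling frame, draws a sample, and lets respondents drop out non-randomly, comparing the mean of the outcome at each stage.

```python
import numpy as np

rng = np.random.default_rng(1)

# Target population: a binary trait that differs between two groups.
n = 100_000
group = rng.integers(0, 2, size=n)            # 0 = offline, 1 = online
y = rng.binomial(1, np.where(group == 1, 0.6, 0.4))
print(f"Target population mean: {y.mean():.3f}")

# Coverage error: the sampling frame misses everyone in the offline group.
frame = group == 1
print(f"Frame population mean:  {y[frame].mean():.3f}")

# Sampling error: a simple random sample drawn from the frame.
sample_idx = rng.choice(np.flatnonzero(frame), size=500, replace=False)
print(f"Sample mean:            {y[sample_idx].mean():.3f}")

# Nonresponse error: people with y == 1 are more likely to respond.
respond = rng.random(sample_idx.size) < np.where(y[sample_idx] == 1, 0.8, 0.4)
print(f"Respondent mean:        {y[sample_idx][respond].mean():.3f}")
```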
The research of Blumenstock and colleagues (2015) involved building a machine learning model that could use digital trace data to predict survey responses. Now, you are going to try the same thing with a different dataset. Kosinski, Stillwell, and Graepel (2013) found that Facebook likes can predict individual traits and attributes. Surprisingly, these predictions can be even more accurate than those of friends and colleagues (Youyou, Kosinski, and Stillwell 2015).
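If you work with a likes-style dataset, the pipeline described by Kosinski, Stillwell, and Graepel (2013), which reduces a sparse user-by-like matrix with singular value decomposition and then regresses each trait on the components, might look roughly like the following sketch. The data here are random placeholders, and the 50 components and logistic regression are illustrative choices rather than the paper's exact settings, so the reported accuracy will hover around chance.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)

# Placeholder data: a sparse 0/1 user-by-like matrix and a binary trait.
n_users, n_likes = 2_000, 5_000
likes = sparse_random(n_users, n_likes, density=0.01, random_state=2,
                      data_rvs=lambda k: np.ones(k)).tocsr()
trait = rng.integers(0, 2, size=n_users)      # e.g., a self-reported attribute

# Reduce the likes matrix to a small number of components, then fit a
# logistic regression of the trait on the components and cross-validate.
components = TruncatedSVD(n_components=50, random_state=2).fit_transform(likes)
scores = cross_val_score(LogisticRegression(max_iter=1000), components, trait,
                         cv=5, scoring="roc_auc")
print(f"Mean cross-validated AUC: {scores.mean():.2f}")   # ~0.5 on random data
```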
Toole et al. (2015) used call detail records (CDRs) from mobile phones to predict aggregate unemployment trends.