In the chapter, I was very positive about post-stratification. However, it does not always improve the quality of estimates. Construct a situation where post-stratification can decrease the quality of estimates. (For a hint, see Thomsen (1973)).
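As a starting point, here is a minimal simulation sketch of one such situation, with invented population values: the post-stratification variable carries no information about the outcome, and the post-strata are small, so the adjustment adds noise without removing any bias.

```python
# A minimal sketch (invented population; equal-probability sampling): the
# post-stratification variable is unrelated to the outcome, and the post-strata
# are small, so post-stratification only adds noise.
import numpy as np

rng = np.random.default_rng(0)

N, n, H, reps = 100_000, 100, 25, 5_000
stratum = rng.integers(0, H, size=N)           # unrelated to the outcome
y = rng.normal(50, 10, size=N)                 # outcome
pop_mean = y.mean()
share = np.bincount(stratum, minlength=H) / N  # known population shares

err_raw, err_ps = [], []
for _ in range(reps):
    idx = rng.choice(N, size=n, replace=False)
    ys, ss = y[idx], stratum[idx]
    err_raw.append(ys.mean() - pop_mean)
    est = 0.0
    for h in range(H):
        cell = ys[ss == h]
        # empty cells fall back to the overall sample mean
        est += share[h] * (cell.mean() if cell.size else ys.mean())
    err_ps.append(est - pop_mean)

print("RMSE, unadjusted mean:     ", np.sqrt(np.mean(np.square(err_raw))))
print("RMSE, post-stratified mean:", np.sqrt(np.mean(np.square(err_ps))))
```

With these settings, the post-stratified estimator typically shows a somewhat larger root mean square error than the unadjusted sample mean, because each small cell mean is noisy; Thomsen (1973) analyzes when this kind of deterioration occurs.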
Design and conduct a non-probability survey on Amazon MTurk to ask about gun ownership (“Do you, or does anyone in your household, own a gun, rifle or pistol? Is that you or someone else in your household?”) and attitudes towards gun control (“What do you think is more important: to protect the right of Americans to own guns, or to control gun ownership?”).
Goel and colleagues (2016) administered a non-probability-based survey on Amazon MTurk consisting of 49 multiple-choice attitudinal questions drawn from the General Social Survey (GSS) and select surveys by the Pew Research Center. They then adjusted for the non-representativeness of the data using model-based post-stratification (Mr. P.) and compared the adjusted estimates with those from the probability-based GSS/Pew surveys. Conduct the same survey on MTurk and try to replicate Figure 2a and Figure 2b by comparing your adjusted estimates with the estimates from the most recent rounds of the GSS/Pew surveys (see Appendix Table A2 for the list of 49 questions).
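As a starting point, here is a minimal sketch of the post-stratification step, assuming hypothetical data frames: `survey` with one row per MTurk respondent, a 0/1 answer `y`, and demographic cells defined by `age_group`, `gender`, and `education`; and `census` with the population count `N` for each cell. Goel et al. (2016) fit a multilevel (hierarchical) model; the crude shrinkage toward the overall mean used here is only a stand-in for that partial pooling.

```python
# A minimal sketch of the post-stratification step (column names are assumptions).
import pandas as pd

def mrp_estimate(survey: pd.DataFrame, census: pd.DataFrame, m: float = 5.0) -> float:
    cells = ["age_group", "gender", "education"]
    p_overall = survey["y"].mean()

    # Per-cell estimates with shrinkage toward the overall mean (a stand-in for
    # the partial pooling that a multilevel model would provide).
    grouped = survey.groupby(cells)["y"].agg(["sum", "count"]).reset_index()
    grouped["p_hat"] = (grouped["sum"] + m * p_overall) / (grouped["count"] + m)

    # Attach population counts; cells unobserved in the survey fall back to
    # the overall sample mean.
    merged = census.merge(grouped[cells + ["p_hat"]], on=cells, how="left")
    merged["p_hat"] = merged["p_hat"].fillna(p_overall)

    # Population-weighted average over cells.
    return (merged["p_hat"] * merged["N"]).sum() / merged["N"].sum()
```

In a full replication, you would use whatever demographic variables Goel et al. (2016) post-stratify on and fit a proper multilevel model (e.g., with lme4 or PyMC) instead of the ad hoc shrinkage above.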
Many studies use self-reported measures of mobile phone activity. This is an interesting setting where researchers can compare self-reported behavior with logged behavior (see, e.g., Boase and Ling (2013)). Two common behaviors to ask about are calling and texting, and two common time frames are “yesterday” and “in the past week.”
Schuman and Presser (1996) argue that question order matters for two types of relations between questions: part-part questions, where two questions are at the same level of specificity (e.g., ratings of two presidential candidates); and part-whole questions, where a general question follows a more specific question (e.g., asking “How satisfied are you with your work?” followed by “How satisfied are you with your life?”).
They further characterize two types of question order effect: consistency effects occur when responses to a later question are brought closer (than they would otherwise be) to those given to an earlier question; contrast effects occur when the differences between responses to the two questions are greater than they would otherwise be.
Building on the work of Schuman and Presser, Moore (2002) describes a separate dimension of question order effects: additive and subtractive effects. While contrast and consistency effects are produced as a consequence of respondents’ evaluations of the two items in relation to each other, additive and subtractive effects are produced when respondents are made more sensitive to the larger framework within which the questions are posed. Read Moore (2002), then design and run a survey experiment on MTurk to demonstrate additive or subtractive effects.
Christopher Antoun and colleagues (2015) conducted a study comparing convenience samples obtained from four different online recruiting sources: MTurk, Craigslist, Google AdWords, and Facebook. Design a simple survey and recruit participants through at least two different online recruiting sources (they can be different from the four sources used by Antoun et al. (2015)).
YouGov, an internet-based market research firm, conducted online polls of a panel of about 800,000 respondents in the UK and used Mr. P. to predict the result of the EU referendum (i.e., Brexit), in which UK voters voted either to remain in or to leave the European Union.
A detailed description of YouGov’s statistical model is available at https://yougov.co.uk/news/2016/06/21/yougov-referendum-model/. Roughly speaking, YouGov partitioned voters into types based on 2015 general election vote choice, age, qualifications, gender, date of interview, and the constituency they live in. First, they used data collected from YouGov panelists to estimate, among those who vote, the proportion of people of each voter type who intended to vote Leave. Second, they estimated the turnout of each voter type using the 2015 British Election Study (BES) post-election face-to-face survey, which validated turnout against the electoral rolls. Finally, they estimated how many people of each voter type there are in the electorate, based on the latest census and the Annual Population Survey (with some additional information from the BES, YouGov survey data from around the general election, and information on how many people voted for each party in each constituency). A sketch of how these three pieces combine appears below.
Three days before the vote, YouGov showed a two-point lead for Leave. On the eve of voting, the poll showed the race as too close to call (49-51 Remain). The final on-the-day poll predicted 48/52 in favor of Remain (https://yougov.co.uk/news/2016/06/23/yougov-day-poll/). In fact, this estimate missed the final result (52-48 Leave) by four percentage points.
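To make the combining step concrete, here is a minimal sketch with made-up numbers for three illustrative voter types; in YouGov’s model there are far more types, and each quantity is estimated from data rather than assumed.

```python
# A minimal sketch of the final combining step. For each voter type we need:
# the estimated share intending to vote Leave (from the panel), the estimated
# turnout (from the validated BES data), and the number of such people in the
# electorate (from the census / Annual Population Survey). Numbers are made up.
import numpy as np

leave_share = np.array([0.70, 0.45, 0.30])   # P(Leave | votes), by voter type
turnout     = np.array([0.80, 0.65, 0.50])   # P(votes), by voter type
electorate  = np.array([12e6, 18e6, 16e6])   # number of people of each type

expected_voters = turnout * electorate
leave_estimate = (leave_share * expected_voters).sum() / expected_voters.sum()
print(f"Predicted Leave share: {leave_estimate:.1%}")
```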
Write a simulation to illustrate each of the representation errors in Figure 3.1.
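As a starting point, here is a minimal sketch, assuming Figure 3.1 distinguishes coverage error, sampling error, and non-response error; the population values and the coverage and response propensities below are invented purely for illustration.

```python
# A minimal sketch of the three representation errors (all values invented).
import numpy as np

rng = np.random.default_rng(1)

N = 100_000
y = rng.normal(50, 10, size=N)                      # outcome in the target population
covered = rng.random(N) < 0.8 + 0.002 * (y - 50)    # frame misses some units (coverage)
frame = np.where(covered)[0]

target_mean = y.mean()
coverage_error = y[frame].mean() - target_mean

sample = rng.choice(frame, size=500, replace=False)  # simple random sample of the frame
sampling_error = y[sample].mean() - y[frame].mean()

responds = rng.random(sample.size) < 0.5 + 0.004 * (y[sample] - 50)  # non-response
respondents = sample[responds]
nonresponse_error = y[respondents].mean() - y[sample].mean()

print("coverage error:    ", coverage_error)
print("sampling error:    ", sampling_error)
print("non-response error:", nonresponse_error)
print("total error:       ", y[respondents].mean() - target_mean)
```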
The research of Blumenstock and colleagues (2015) involved building a machine learning model that could use digital trace data to predict survey responses. Now, you are going to try the same thing with a different dataset. Kosinski, Stillwell, and Graepel (2013) found that Facebook likes can predict individual traits and attributes. Surprisingly, these predictions can be even more accurate than those made by friends and colleagues (Youyou, Kosinski, and Stillwell 2015).
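As a starting point, here is a minimal sketch of the kind of pipeline used by Kosinski, Stillwell, and Graepel (2013), which reduces the user-by-like matrix with singular value decomposition and then fits a regression model; the `likes` matrix and `trait` vector below are random placeholders that you would replace with real data.

```python
# A minimal sketch of a likes-to-traits prediction pipeline (placeholder data).
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
likes = (rng.random((1000, 5000)) < 0.02).astype(float)  # users x pages, placeholder
trait = rng.integers(0, 2, size=1000)                    # 0/1 survey trait, placeholder

# Reduce the sparse like matrix to a few dozen components, then predict the trait.
components = TruncatedSVD(n_components=50, random_state=0).fit_transform(likes)
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, components, trait, cv=5, scoring="roc_auc")
print("cross-validated AUC:", scores.mean())
```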
Toole et al. (2015) use call detail records (CDRs) from mobile phones to predict aggregate unemployment trends.