With non-probability samples, weights can undo distortions caused by the assumed sampling process.
In the same way that researchers weight responses from probability samples, they can also weight responses from non-probability samples. For example, as an alternative to the CPS, imagine that you placed banner ads on thousands of websites to recruit participants for a survey to estimate the unemployment rate. Naturally, you would be skeptical that the simple mean of your sample would be a good estimate of the unemployment rate. Your skepticism probably arises because you think that some people are more likely to complete your survey than others. For example, people who don't spend a lot of time on the web are less likely to complete your survey.
As we saw in the last section, however, if we know how the sample was selected (as we do with probability samples), then we can undo distortions caused by the sampling process. Unfortunately, when working with non-probability samples, we don't know how the sample was selected. But we can make assumptions about the sampling process and then apply weighting in the same way. If these assumptions are correct, then the weighting will undo the distortions caused by the sampling process.
For example, imagine that in response to your banner ads, you recruited 100,000 respondents. However, you don't believe that these 100,000 respondents are a simple random sample of American adults. In fact, when you compare your respondents to the US population, you find that people from some states (e.g., New York) are over-represented and that people from other states (e.g., Alaska) are under-represented. Thus, the unemployment rate of your sample is likely to be a bad estimate of the unemployment rate in the target population.
One way to undo the distortion that happened in the sampling process is to assign a weight to each person: lower weights to people from states that are over-represented in the sample (e.g., New York) and higher weights to people from states that are under-represented in the sample (e.g., Alaska). More specifically, the weight for each respondent is related to their prevalence in your sample relative to their prevalence in the US population. This weighting procedure is called post-stratification, and the idea of weighting should remind you of the example in Section 3.4.1, where respondents from Rhode Island were given less weight than respondents from California. Post-stratification requires that you know enough to put your respondents into groups and that you know the proportion of the target population in each group.
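A minimal sketch of this weighting step may help make it concrete. The respondents, the two-state "country", its population shares, and the function names are all made up for illustration; the only idea taken from the text is that each respondent's weight is the population share of their group divided by that group's share of the sample.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per respondent: population share / sample share of their group."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy sample: 8 respondents from New York, 2 from Alaska.
groups = ["NY"] * 8 + ["AK"] * 2
unemployed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]  # 1 = unemployed, 0 = employed

# Made-up population shares: the two states are equally large in the target population.
population_shares = {"NY": 0.5, "AK": 0.5}

weights = poststratification_weights(groups, population_shares)
raw_rate = sum(unemployed) / len(unemployed)         # simple sample mean
adjusted_rate = weighted_mean(unemployed, weights)   # post-stratified estimate

print(f"raw: {raw_rate:.4f}, post-stratified: {adjusted_rate:.4f}")
```

In this toy sample the over-represented New York respondents each receive a weight below 1 and the under-represented Alaska respondents a weight above 1, so the adjusted rate moves toward the Alaska group's rate.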
Although the weighting of a probability sample and of a non-probability sample is the same mathematically (see the technical appendix), the two work well in different situations. If the researcher has a perfect probability sample (i.e., no coverage error and no non-response), then weighting will produce unbiased estimates for all traits in all cases. This strong theoretical guarantee is why advocates of probability samples find them so attractive. On the other hand, weighting non-probability samples will only produce unbiased estimates for all traits if the response propensities are the same for everyone in each group. In other words, thinking back to our example, post-stratification will produce unbiased estimates if everyone in New York has the same probability of participating, everyone in Alaska has the same probability of participating, and so on. This assumption is called the homogeneous-response-propensities-within-groups assumption, and it plays a key role in knowing whether post-stratification will work well with non-probability samples.
Unfortunately, in our example, the homogeneous-response-propensities-within-groups assumption seems unlikely to be true. That is, it seems unlikely that everyone in Alaska has the same probability of being in your survey. But there are three important points to keep in mind about post-stratification, all of which make it seem more promising.
First, the homogeneous-response-propensities-within-groups assumption becomes more plausible as the number of groups increases. And researchers are not limited to groups based on a single geographic dimension. For example, we could create groups based on state, age, sex, and level of education. It seems more plausible that response propensities are homogeneous within the group of 18- to 29-year-old female college graduates living in Alaska than within the group of all people living in Alaska. Thus, as the number of groups used for post-stratification increases, the assumptions needed to support it become more reasonable. Given this fact, it seems that a researcher would want to create a huge number of groups for post-stratification. But as the number of groups increases, researchers run into a different problem: data sparsity. If there are only a small number of people in each group, then the estimates will be more uncertain, and in the extreme case where a group has no respondents, post-stratification breaks down completely. There are two ways out of this inherent tension between the plausibility of the homogeneous-response-propensities-within-groups assumption and the demand for reasonable sample sizes in each group. One approach is to move to a more sophisticated statistical model for calculating weights, and the other is to collect a larger, more diverse sample, which helps ensure reasonable sample sizes in each group. And sometimes researchers do both, as I'll describe in more detail below.
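The sparsity problem can be seen in a small sketch. The category lists and the six respondents below are invented for illustration; the point is only that crossing several variables multiplies the number of groups quickly, so many groups end up with no respondents at all.

```python
from collections import Counter
from itertools import product

# Hypothetical post-stratification variables (not the categories of any real survey).
states = ["NY", "AK"]
ages = ["18-29", "30-44", "45-64", "65+"]
sexes = ["female", "male"]
education = ["no college", "college"]

# Every combination of the four variables is one post-stratification group.
cells = list(product(states, ages, sexes, education))  # 2 * 4 * 2 * 2 groups

# Made-up respondents, each described by the group they fall into.
respondents = [
    ("NY", "18-29", "male", "college"),
    ("NY", "18-29", "male", "college"),
    ("NY", "18-29", "male", "college"),
    ("NY", "30-44", "male", "college"),
    ("NY", "18-29", "female", "college"),
    ("AK", "45-64", "male", "no college"),
]

counts = Counter(respondents)
empty_cells = [cell for cell in cells if counts[cell] == 0]

print(f"{len(cells)} groups, {len(empty_cells)} of them with no respondents")
```

With only six respondents spread over 32 groups, most groups are empty, and a weight of the kind described above cannot be computed for them.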
A second consideration when working with post-stratification from non-probability samples is that the homogeneous-response-propensities-within-groups assumption is already frequently made when analyzing probability samples. The reason this assumption is needed for probability samples in practice is that probability samples have non-response, and the most common method for adjusting for non-response is post-stratification as described above. Of course, just because many researchers make a certain assumption doesn't mean that you should make it too. But it does mean that when comparing non-probability samples to probability samples in practice, we must keep in mind that both depend on assumptions and auxiliary information in order to produce estimates. In most realistic settings, there is simply no assumption-free approach to inference.
Finally, if you care about one estimate in particular (in our example, the unemployment rate), then you need a condition weaker than the homogeneous-response-propensities-within-groups assumption. Specifically, you don't need to assume that everyone has the same response propensity; you only need to assume that there is no correlation between response propensity and unemployment status within each group. Of course, even this weaker condition will not hold in some situations. For example, imagine estimating the proportion of Americans who do volunteer work. If people who do volunteer work are more likely to agree to be in a survey, then researchers will systematically over-estimate the amount of volunteering, even if they do post-stratification adjustments, a result that has been demonstrated empirically by Abraham, Helms, and Presser (2009).
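The difference between the two conditions can be illustrated with a small calculation of what the respondents in a single group would report on average in a very large sample. Everything here is an assumption made for illustration: the group is split into two hypothetical subgroups, each given a share of the group, a response propensity, and an outcome rate, and the true rate in the group is 0.2 in all three scenarios.

```python
def true_mean(subgroups):
    """Outcome rate in the whole group (respondents and non-respondents)."""
    return sum(share * rate for share, _, rate in subgroups)


def expected_respondent_mean(subgroups):
    """Large-sample outcome rate among those who respond.

    Each subgroup is (share of group, response propensity, outcome rate).
    """
    responding = sum(share * p for share, p, _ in subgroups)
    responding_with_outcome = sum(share * p * rate for share, p, rate in subgroups)
    return responding_with_outcome / responding


# Scenario 1: everyone in the group has the same response propensity.
homogeneous = [(0.5, 0.02, 0.1), (0.5, 0.02, 0.3)]

# Scenario 2: propensities differ, but are unrelated to the outcome.
uncorrelated = [(0.5, 0.01, 0.2), (0.5, 0.03, 0.2)]

# Scenario 3: people with a higher outcome rate are also more likely to respond.
correlated = [(0.5, 0.01, 0.1), (0.5, 0.03, 0.3)]

for name, scenario in [("homogeneous", homogeneous),
                       ("uncorrelated", uncorrelated),
                       ("correlated", correlated)]:
    print(name, round(true_mean(scenario), 4),
          round(expected_respondent_mean(scenario), 4))
```

In the first two scenarios the respondents match the group's true rate, so weighting the group up or down cannot introduce bias. In the third, the respondents overstate the rate within the group, and no group-level weight can repair that.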
As I said earlier, non-probability samples are viewed with great skepticism by social scientists, in part because of their role in some of the most embarrassing failures in the early days of survey research. A clear example of how far we have come with non-probability samples is the research of Wei Wang, David Rothschild, Sharad Goel, and Andrew Gelman, who correctly recovered the outcome of the 2012 US election using a non-probability sample of American Xbox users, a decidedly non-random sample of Americans (Wang et al. 2015). The researchers recruited respondents from the Xbox gaming system, and as you might expect, the Xbox sample skewed male and skewed young: 18- to 29-year-olds make up 19% of the electorate but 65% of the Xbox sample, and men make up 47% of the electorate but 93% of the Xbox sample (Figure 3.4). Because of these strong demographic biases, the raw Xbox data were a poor indicator of election returns. They predicted a strong victory for Mitt Romney over Barack Obama. Again, this is another example of the dangers of raw, unadjusted non-probability samples, and it is reminiscent of the Literary Digest fiasco.
However, Wang and colleagues were aware of these problems and attempted to weight the respondents to correct for the sampling process. In particular, they used a more sophisticated form of the post-stratification I told you about. It is worth learning a bit more about their approach because it builds intuition about post-stratification, and the particular version Wang and colleagues used is one of the most exciting approaches to weighting non-probability samples.
In our simple example about estimating unemployment in Section 3.4.1, we divided the population into groups based on state of residence. In contrast, Wang and colleagues divided the population into 176,256 groups defined by gender (2 categories), race (4 categories), age (4 categories), education (4 categories), state (51 categories), party ID (3 categories), ideology (3 categories), and 2008 vote (3 categories). With more groups, the researchers hoped that it would be increasingly likely that, within each group, response propensity was uncorrelated with support for Obama. Next, rather than constructing individual-level weights, as we did in our example, Wang and colleagues used a complex model to estimate the proportion of people in each group who would vote for Obama. Finally, they combined these group estimates of support with the known size of each group to produce an estimated overall level of support. In other words, they chopped up the population into different groups, estimated the support for Obama in each group, and then took a weighted average of the group estimates to produce an overall estimate.
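Two pieces of this description are simple arithmetic and can be checked in a few lines. The category counts come from the text; the three groups, their sizes, and their estimated support in the second half are made-up numbers used only to show the weighted average, not values from the study.

```python
import math

# Number of categories per variable, as listed in the text.
categories = {
    "gender": 2, "race": 4, "age": 4, "education": 4,
    "state": 51, "party ID": 3, "ideology": 3, "2008 vote": 3,
}
n_groups = math.prod(categories.values())
print(n_groups)  # 176256


def poststratified_estimate(group_estimates, group_sizes):
    """Average the group estimates, weighting each by its known population size."""
    total = sum(group_sizes.values())
    return sum(group_estimates[g] * group_sizes[g] for g in group_sizes) / total


# Hypothetical toy example with three groups.
group_estimates = {"a": 0.6, "b": 0.4, "c": 0.5}   # estimated support in each group
group_sizes = {"a": 100, "b": 300, "c": 600}       # known population size of each group

overall = poststratified_estimate(group_estimates, group_sizes)
print(round(overall, 4))
```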
Thus, the big challenge in their approach is to estimate the support for Obama in each of these 176,256 groups. Although their panel included 345,858 unique participants, a huge number by the standards of election polling, there were many, many groups for which Wang and colleagues had almost no respondents. Therefore, to estimate the support in each group they used a technique called multilevel regression with post-stratification, which researchers affectionately call Mr. P. Essentially, to estimate the support for Obama within a specific group, Mr. P. pools information from many closely related groups. For example, consider the challenge of estimating the support for Obama among female Hispanics, between 18 and 29 years old, who are college graduates, who are registered Democrats, who self-identify as moderates, and who voted for Obama in 2008. This is a very, very specific group, and it is possible that there is nobody in the sample with these characteristics. Therefore, to make estimates about this group, Mr. P. pools together estimates from people in very similar groups.
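The intuition behind pooling can be sketched with a much cruder stand-in than the multilevel regression the text describes. The function below is not Mr. P. and not the model Wang and colleagues fit; it simply shrinks each group's raw proportion toward the overall proportion, with a hypothetical `prior_strength` parameter controlling how strongly small groups are pulled. The group labels and counts are invented.

```python
def partially_pooled_estimates(group_counts, prior_strength=10.0):
    """Shrink each group's raw proportion toward the overall proportion.

    group_counts maps a group label to (supporters, respondents).
    prior_strength (an assumed tuning value) acts like that many extra
    respondents who answer at the overall rate.
    """
    total_supporters = sum(s for s, _ in group_counts.values())
    total_respondents = sum(n for _, n in group_counts.values())
    overall = total_supporters / total_respondents
    pooled = {
        group: (s + prior_strength * overall) / (n + prior_strength)
        for group, (s, n) in group_counts.items()
    }
    return pooled, overall


# Made-up counts: one large group, one tiny group, one group with no respondents.
group_counts = {"large": (90, 200), "tiny": (3, 4), "empty": (0, 0)}

pooled, overall = partially_pooled_estimates(group_counts)
for group, estimate in pooled.items():
    print(group, round(estimate, 3))
```

The large group's estimate stays close to its own raw proportion, the tiny group's estimate is pulled well toward the overall rate, and the empty group still receives an estimate instead of breaking the calculation. A real multilevel model borrows strength from groups that share characteristics, not just from the overall mean.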
Using this analysis strategy, Wang and colleagues were able to use the Xbox non-probability sample to very closely estimate the overall support that Obama received in the 2012 election (Figure 3.5). In fact, their estimates were more accurate than an aggregate of public opinion polls. Thus, in this case, weighting (specifically Mr. P.) seems to do a good job correcting the biases in non-probability data, biases that are visible when you look at the estimates from the unadjusted Xbox data.
There are two main lessons from the study of Wang and colleagues. First, unadjusted non-probability samples can lead to bad estimates; this is a lesson that many researchers have heard before. The second lesson, however, is that non-probability samples, when weighted properly, can actually produce quite good estimates. In fact, their estimates were more accurate than the estimates from pollster.com, an aggregation of more traditional election polls.
Finally, there are important limitations to what we can learn from this one specific study. Just because post-stratification worked well in this particular case, there is no guarantee that it will work well in other cases. In fact, elections are perhaps one of the easiest settings because pollsters have been studying elections for almost 100 years, there is regular feedback (we can see who wins the elections), and party identification and demographic characteristics are relatively predictive of voting. At this point, we lack solid theory and empirical experience to know when weighting adjustments to non-probability samples will produce sufficiently accurate estimates. One thing that is clear, however, is that if you are forced to work with non-probability samples, then there is strong reason to believe that adjusted estimates will be better than non-adjusted estimates.