Even though it can be messy, enriched asking can be powerful.
A different approach to dealing with the incompleteness of digital trace data is to enrich it directly with survey data, a process that I'll call enriched asking. One example of enriched asking is the study by Burke and Kraut (2014), which I described earlier in the chapter (Section 3.2), about whether interacting on Facebook increases friendship strength. In that case, Burke and Kraut combined survey data with Facebook log data.
The setting that Burke and Kraut were working in, however, meant that they did not have to deal with two big problems that researchers doing enriched asking face. First, actually linking the datasets together can be difficult and error-prone (we will see an example of this problem below). This process is called record linkage: matching a record in one dataset with the appropriate record in the other dataset. The second main problem of enriched asking is that the quality of the digital traces will frequently be difficult for researchers to assess. For example, the process through which the data are collected is sometimes proprietary and could be susceptible to many of the problems described in Chapter 2. In other words, enriched asking will frequently involve error-prone linking of surveys to black-box data sources of unknown quality. Despite the concerns that these two problems raise, it is possible to conduct important research with this strategy, as was demonstrated by Stephen Ansolabehere and Eitan Hersh (2012) in their research on voting patterns in the United States. It is worthwhile to go over this study in some detail because many of the strategies that Ansolabehere and Hersh developed will be useful in other applications of enriched asking.
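To make the idea of record linkage concrete, here is a minimal sketch of deterministic (exact-key) linkage in Python. It is not the procedure used in any of the studies discussed here, and the field names (`name`, `birth_year`, `zip`), the choice of key, and the toy records are all assumptions made for illustration. The sketch links a survey record only when exactly one trace record shares its normalized key, and it reports everything else as unmatched, which is one simple way to see why linkage is error-prone.

```python
import unicodedata


def normalize(text):
    """Lowercase, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def link_key(record):
    # Hypothetical linkage key: normalized name, birth year, and ZIP code.
    return (normalize(record["name"]), record["birth_year"], record["zip"])


def link_records(survey, trace):
    """Return (links, unmatched) for a deterministic linkage.

    links is a list of (survey_id, trace_id) pairs. A survey record is
    linked only if exactly one trace record has the same key; records
    with no candidate or several candidates are listed in unmatched.
    """
    index = {}
    for rec in trace:
        index.setdefault(link_key(rec), []).append(rec["id"])

    links, unmatched = [], []
    for rec in survey:
        candidates = index.get(link_key(rec), [])
        if len(candidates) == 1:
            links.append((rec["id"], candidates[0]))
        else:
            unmatched.append(rec["id"])
    return links, unmatched


# Invented toy records, for illustration only.
survey = [
    {"id": "s1", "name": "José  Díaz", "birth_year": 1980, "zip": "08540"},
    {"id": "s2", "name": "Ann Lee", "birth_year": 1975, "zip": "10001"},
    {"id": "s3", "name": "Bob Stone", "birth_year": 1990, "zip": "60601"},
]
trace = [
    {"id": "t1", "name": "jose diaz", "birth_year": 1980, "zip": "08540"},
    {"id": "t2", "name": "Ann Lee", "birth_year": 1975, "zip": "10001"},
    {"id": "t3", "name": "ann lee", "birth_year": 1975, "zip": "10001"},
    {"id": "t4", "name": "Robert Stone", "birth_year": 1990, "zip": "60601"},
]

links, unmatched = link_records(survey, trace)
```

In this toy example only `s1` is linked. `s2` is ambiguous because two trace records share its key, and `s3` is missed because "Bob" and "Robert" do not match exactly. Real linkage work typically has to handle both kinds of failure, often with probabilistic rather than exact matching.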
Voter turnout has been the subject of extensive research in political science, and in the past, researchers' understanding of who votes and why has generally been based on the analysis of survey data. Voting in the United States, however, is an unusual behavior in that the government records whether each citizen has voted (of course, the government does not record whom each citizen votes for). For many years, these governmental voting records were available only on paper forms, scattered across local government offices around the country. This made it difficult, but not impossible, for political scientists to get a complete picture of the electorate and to compare what people say in surveys about voting with their actual voting behavior (Ansolabehere and Hersh 2012).
Now, however, these voting records have been digitized, and a number of private companies have systematically collected and merged them to produce comprehensive master voting files that record the voting behavior of all Americans. Ansolabehere and Hersh partnered with one of these companies, Catalist LCC, in order to use its master voting file to help develop a better picture of the electorate. Further, because the study relied on digital records collected and curated by a company, it offered a number of advantages over previous efforts by researchers, which had been carried out without the aid of companies and with analog records.
Like many of the digital trace sources in Chapter 2, the Catalist master file did not include much of the demographic, attitudinal, and behavioral information that Ansolabehere and Hersh needed. In addition to this information, Ansolabehere and Hersh were particularly interested in comparing reported voting behavior with validated voting behavior (i.e., the information in the Catalist database). So the researchers collected the data that they wanted as part of the Cooperative Congressional Election Study (CCES), a large social survey. Next, the researchers gave these data to Catalist, and Catalist gave the researchers back a merged data file that included validated voting behavior (from Catalist), self-reported voting behavior (from CCES), and the demographics and attitudes of respondents (from CCES). In other words, Ansolabehere and Hersh enriched the voting data with survey data, and the resulting merged file enabled them to do something that neither file enabled individually.
By enriching the Catalist master data file with survey data, Ansolabehere and Hersh came to three important conclusions. First, over-reporting of voting is rampant: almost half of the non-voters reported voting. Another way of looking at it is that if someone reported voting, there is only an 80% chance that they actually voted. Second, over-reporting is not random; it is more common among high-income, well-educated partisans who are engaged in public affairs. In other words, the people who are most likely to vote are also the most likely to lie about voting. Third, and most critically, because of the systematic nature of over-reporting, the actual differences between voters and non-voters are smaller than they appear from surveys alone. For example, those with a bachelor's degree are about 22 percentage points more likely to report voting, but they are only 10 percentage points more likely to actually vote. Further, existing resource-based theories of voting are much better at predicting who will report voting than who actually votes, an empirical finding that calls for new theories to understand and predict voting.
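The first conclusion combines two different quantities that are easy to confuse: the share of non-voters who report voting, and the share of people reporting a vote who actually voted. The following sketch shows how both could be computed from a merged file of reported and validated behavior. The function name and the toy counts are assumptions made for illustration; the counts are invented to mimic the shape of the pattern described above and are not data from the study.

```python
def overreporting_summary(merged):
    """Summarize over-reporting in a merged survey/validation file.

    merged is a list of (reported_voting, validated_voting) boolean
    pairs, one pair per linked respondent.
    """
    reported_and_voted = sum(1 for r, v in merged if r and v)
    reported_not_voted = sum(1 for r, v in merged if r and not v)
    nonvoters = sum(1 for _, v in merged if not v)
    reporters = reported_and_voted + reported_not_voted

    return {
        # Among validated non-voters, what share claimed to have voted?
        "overreport_rate_among_nonvoters": (
            reported_not_voted / nonvoters if nonvoters else None
        ),
        # Among those who claimed to have voted, what share really did?
        "share_of_reporters_who_voted": (
            reported_and_voted / reporters if reporters else None
        ),
    }


# Invented toy counts (NOT the study's data): 12 linked respondents.
toy_merged = (
    [(True, True)] * 8       # reported voting and validated as voters
    + [(True, False)] * 2    # reported voting but validated as non-voters
    + [(False, False)] * 2   # reported not voting, validated as non-voters
)

summary = overreporting_summary(toy_merged)
```

With these toy counts, half of the non-voters over-report (2 of 4), yet 80% of those who report voting did vote (8 of 10). The two numbers differ because they condition on different groups, and neither can be computed from the survey or the voting file alone.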
But how much should we trust these results? Remember that these results depend on error-prone linking to black-box data with unknown amounts of error. More specifically, the results hinge on two key steps: 1) the ability of Catalist to combine many disparate data sources to produce an accurate master datafile, and 2) the ability of Catalist to link the survey data to its master datafile. Each of these steps is quite difficult, and errors at either step could lead researchers to the wrong conclusions. However, both data processing and matching are critical to the continued existence of Catalist as a company, so it can invest resources in solving these problems, often at a scale that no individual academic researcher or group of researchers can match. In the further reading at the end of the chapter, I describe these problems in more detail and explain how Ansolabehere and Hersh built confidence in their results. Even though these details are specific to this study, similar issues will arise for other researchers wishing to link to black-box digital trace data sources.
What general lessons can researchers draw from this study? First, there is tremendous value in enriching digital traces with survey data. Second, even though these aggregated, commercial data sources should not be considered "ground truth", in some cases they can be useful. In fact, it is best not to compare these data sources with absolute truth (from which they will always fall short). Rather, it is better to compare them with other available data sources, which invariably have errors as well.