Further commentary

This section is designed to be used as a reference, rather than to be read as a narrative.

  • Introduction (Section 2.1)

One kind of observing that is not included in this chapter is ethnography. For more on ethnography in digital spaces, see Boellstorff et al. (2012), and for more on ethnography in mixed digital and physical spaces, see Lane (2016).

  • Big data (Section 2.2)

When you are repurposing data, two mental tricks can help you understand the problems you might encounter. First, you can try to imagine the ideal dataset for your problem and compare it to the dataset that you are using. How are they similar, and how are they different? If you did not collect your data yourself, there are likely to be differences between what you want and what you have. You then have to decide whether these differences are minor or major.

Second, remember that someone created and collected your data for some reason. You should try to understand their reasoning. This kind of reverse-engineering can help you identify possible problems and biases in your repurposed data.

There is no single consensus definition of "big data", but many definitions seem to focus on the 3 Vs: volume, variety, and velocity (e.g., Japec et al. (2015)). Rather than focusing on the characteristics of the data, my definition focuses more on why the data was created.

My inclusion of government administrative data inside the category of big data is a bit unusual. Others who have made this case include Legewie (2015), Connelly et al. (2016), and Einav and Levin (2014). For more about the value of government administrative data for research, see Card et al. (2010), Taskforce (2012), and Grusky, Smeeding, and Snipp (2015).

For a view of administrative research from inside the government statistical system, particularly the US Census Bureau, see Jarmin and O'Hara (2016). For a book-length treatment of administrative records research at Statistics Sweden, see Wallgren and Wallgren (2007).

In the chapter, I briefly compared a traditional survey such as the General Social Survey (GSS) to a social media data source such as Twitter. For a thorough and careful comparison between traditional surveys and social media data, see Schober et al. (2016).

  • Common characteristics of big data (Section 2.3)

These 10 characteristics of big data have been described in a variety of different ways by a variety of different authors. Writing that influenced my thinking on these issues includes: Lazer et al. (2009), Groves (2011), Howison, Wiggins, and Crowston (2011), boyd and Crawford (2012), Taylor (2013), Mayer-Schönberger and Cukier (2013), Golder and Macy (2014), Ruths and Pfeffer (2014), Tufekci (2014), Sampson and Small (2015), Lewis (2015), Lazer (2015), Horton and Tambe (2015), Japec et al. (2015), and Goldstone and Lupyan (2016).

Throughout this chapter, I have used the term digital traces, which I think is relatively neutral. Another popular term for digital traces is digital footprints (Golder and Macy 2014), but as Hal Abelson, Ken Ledeen, and Harry Lewis (2008) point out, a more appropriate term is probably digital fingerprints. When you create footprints, you are aware of what is happening, and your footprints cannot generally be traced to you personally. The same is not true for your digital traces. In fact, you are leaving traces all the time about which you have very little knowledge. And although these traces do not have your name on them, they can often be linked back to you. In other words, they are more like fingerprints: invisible and personally identifying.

Big

For more on why large datasets render statistical tests problematic, see Lin, Lucas, and Shmueli (2013) and McFarland and McFarland (2015). These issues should lead researchers to focus on practical significance rather than statistical significance.
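
The gap between statistical and practical significance can be illustrated with a small sketch. It assumes a two-sample z-test with equal group sizes and a known standard deviation. The function name `two_sided_p_value` and the effect size of 0.01 standard deviations are hypothetical choices made for illustration, not values taken from the cited papers.

```python
import math

def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value of a two-sample z-test (equal group sizes, known sd)."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical, practically negligible effect: 0.01 standard deviations.
effect = 0.01
for n in (100, 10_000, 10_000_000):
    # The same tiny effect moves from "not significant" to "highly
    # significant" purely because the sample size grows.
    print(n, two_sided_p_value(effect, 1.0, n))
```

The observed difference is held fixed, so only the sample size changes the p-value. With big data, a small p-value therefore says little about whether an effect matters in practice.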

Always-on

When considering always-on data, it is important to consider whether you are comparing the exact same people over time or whether you are comparing some changing group of people; see, for example, Diaz et al. (2016).

Non-reactive

A classic book on non-reactive measures is Webb et al. (1966). The examples in the book pre-date the digital age, but they are still illuminating. For examples of people changing their behavior because of the presence of mass surveillance, see Penney (2016) and Brayne (2014).

Incomplete

For more on record linkage, see Dunn (1946) and Fellegi and Sunter (1969) (historical) and Larsen and Winkler (2014) (modern). Similar approaches have also been developed in computer science under names such as data deduplication, instance identification, name matching, duplicate detection, and duplicate record detection (Elmagarmid, Ipeirotis, and Verykios 2007). There are also privacy-preserving approaches to record linkage that do not require the transmission of personally identifying information (Schnell 2013). Facebook has also developed a procedure to link its records to voting behavior; this was done to evaluate an experiment that I will tell you about in Chapter 4 (Bond et al. 2012; Jones et al. 2013).
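
As a rough illustration of what record linkage involves, the sketch below links two small sets of records that agree exactly on a normalized name and a birth year. This is a deliberately simple deterministic rule, not the probabilistic approach of Fellegi and Sunter (1969). The record fields (`id`, `name`, `birth_year`), the function names, and the example data are all hypothetical.

```python
import unicodedata

def normalize_name(name):
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " "
                   for ch in no_accents.lower())
    return " ".join(kept.split())

def link_records(left, right):
    """Return (left_id, right_id) pairs agreeing on normalized name and birth year."""
    index = {}
    for rec in right:
        key = (normalize_name(rec["name"]), rec["birth_year"])
        index.setdefault(key, []).append(rec["id"])
    links = []
    for rec in left:
        key = (normalize_name(rec["name"]), rec["birth_year"])
        for right_id in index.get(key, []):
            links.append((rec["id"], right_id))
    return links

# Hypothetical records from two sources.
left = [
    {"id": "a1", "name": "José  García", "birth_year": 1980},
    {"id": "a2", "name": "Ann O'Neil", "birth_year": 1975},
]
right = [
    {"id": "b1", "name": "jose garcia", "birth_year": 1980},
    {"id": "b2", "name": "Ann ONeil", "birth_year": 1975},
]
links = link_records(left, right)
print(links)
```

In this example the first pair is linked despite differences in accents, case, and spacing. The second pair is missed because the apostrophe changes the normalized name. Misses of this kind are one reason the literature cited above relies on approximate and probabilistic matching.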

For more on construct validity, see Shadish, Cook, and Campbell (2001), Chapter 3.

Inaccessible

For more on the AOL search log debacle, see Ohm (2010). I offer advice about partnering with companies and governments in Chapter 4, when I describe experiments. A number of authors have expressed concerns about research that relies on inaccessible data; see Huberman (2012) and boyd and Crawford (2012).

One good way for university researchers to acquire data access is to work at a company as an intern or visiting researcher. In addition to enabling data access, this will also help researchers learn more about how the data was created, which is important for analysis.

Non-representative

Non-representativeness is a major problem for researchers and governments who wish to make statements about an entire population. It is less of a concern for companies, which are typically focused on their users. For more on how Statistics Netherlands considers the issue of non-representativeness of business big data, see Buelens et al. (2014).

In Chapter 3, I will describe sampling and estimation in much greater detail. Even if data are non-representative, under certain conditions they can be weighted to produce good estimates.
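
One common form of such weighting is post-stratification, sketched below. The sketch assumes that the population share of each group is known, that every group appears in the sample, and that the outcome does not depend on who ends up in the sample within a group. The groups, shares, and outcomes are made-up numbers for illustration.

```python
from collections import defaultdict

def unweighted_mean(sample):
    """Plain mean of the outcomes, ignoring group membership."""
    return sum(outcome for _, outcome in sample) / len(sample)

def poststratified_mean(sample, population_shares):
    """Weight each group's sample mean by its known share of the population."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, outcome in sample:
        totals[group] += outcome
        counts[group] += 1
    # Assumes every group in population_shares appears at least once in sample.
    return sum(share * totals[group] / counts[group]
               for group, share in population_shares.items())

# Hypothetical sample in which young people are over-represented (8 of 10).
sample = [("young", 1)] * 6 + [("young", 0)] * 2 + [("old", 1), ("old", 0)]
# Hypothetical known population shares.
population_shares = {"young": 0.5, "old": 0.5}

print(unweighted_mean(sample))
print(poststratified_mean(sample, population_shares))
```

The weighted estimate shifts toward the under-represented group. Whether it is actually a good estimate depends on the conditions listed above, which the data alone cannot verify.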

Drifting

System drift is very hard to see from the outside. However, the MovieLens project (discussed more in Chapter 4) has been run for more than 15 years by an academic research group. As a result, the group has documented and shared information about how the system has evolved over time and how this might affect analysis (Harper and Konstan 2015).

A number of scholars have focused on drift in Twitter: Liu, Kliman-Silver, and Mislove (2014) and Tufekci (2014).

Algorithmically confounded

I first heard the term "algorithmically confounded" used by Jon Kleinberg in a talk. The main idea behind performativity is that some social science theories are "engines not cameras" (Mackenzie 2008). That is, they actually shape the world rather than just capture it.

Dirty

Governmental statistical agencies call data cleaning "statistical data editing". De Waal, Puts, and Daas (2014) describe statistical data editing techniques developed for survey data and examine the extent to which they are applicable to big data sources, and Puts, Daas, and Waal (2015) present some of the same ideas for a more general audience.

For some examples of studies focused on spam in Twitter, see Clark et al. (2016) and Chu et al. (2012). Finally, Subrahmanian et al. (2016) describe the results of the DARPA Twitter Bot Challenge.

Sensitive

Ohm (2015) reviews earlier research on the idea of sensitive information and offers a multi-factor test. The four factors he proposes are: the possibility of harm; the probability of harm; the presence of a confidential relationship; and whether the risk reflects majoritarian concerns.

  • Counting things (Section 2.4.1)

Farber's study of taxis in New York was based on an earlier study by Camerer et al. (1997) that used three different convenience samples of paper trip sheets (paper forms used by drivers to record trip start time, end time, and fare). This earlier study found that drivers seemed to be target earners: they worked less on days when their wages were higher.

Kossinets and Watts (2009) focused on the origins of homophily in social networks. See Wimmer and Lewis (2010) for a different approach to the same problem that uses data from Facebook.

In subsequent work, King and colleagues have further explored online censorship in China (King, Pan, and Roberts 2014; King, Pan, and Roberts 2016). For a related approach to measuring online censorship in China, see Bamman, O'Connor, and Smith (2012). For more on statistical methods like the one used in King, Pan, and Roberts (2013) to estimate the sentiment of the 11 million posts, see Hopkins and King (2010). For more on supervised learning, see James et al. (2013) (less technical) and Hastie, Tibshirani, and Friedman (2009) (more technical).

  • Forecasting (Section 2.4.2)

Forecasting is a big part of industrial data science (Mayer-Schönberger and Cukier 2013; Provost and Fawcett 2013). One type of forecasting that is commonly done by social researchers is demographic forecasting; see, for example, Raftery et al. (2012).

Google Flu Trends was not the first project to use search data to nowcast influenza prevalence. In fact, researchers in the United States (Polgreen et al. 2008; Ginsberg et al. 2009) and Sweden (Hulth, Rydevik, and Linde 2009) have found that certain search terms (e.g., "flu") predicted national public health surveillance data before it was released. Subsequently, many other projects have tried to use digital trace data for disease surveillance detection; see Althouse et al. (2015) for a review.

In addition to using digital trace data to predict health outcomes, there has also been a huge amount of work using Twitter data to predict election outcomes; for reviews, see Gayo-Avello (2011), Gayo-Avello (2013), Jungherr (2015) (Ch. 7), and Huberty (2015).

Using search data to predict influenza prevalence and using Twitter data to predict elections are both examples of using some kind of digital trace to predict some kind of event in the world. There are an enormous number of studies that have this general structure. Table 2.5 includes a few other examples.

Table 2.5: Partial list of studies that use some digital trace to predict some event.

Digital trace | Outcome | Citation
Twitter | Box office revenue of movies in the US | Asur and Huberman (2010)
Search logs | Sales of movies, music, books, and video games in the US | Goel et al. (2010)
Twitter | Dow Jones Industrial Average (US stock market) | Bollen, Mao, and Zeng (2011)

  • Approximating experiments (Section 2.4.3)

The journal PS Political Science had a symposium on big data, causal inference, and formal theory, and Clark and Golder (2015) summarize each contribution. The journal Proceedings of the National Academy of Sciences of the United States of America had a symposium on causal inference and big data, and Shiffrin (2016) summarizes each contribution.

In terms of natural experiments, Dunning (2012) provides an excellent book-length treatment. For more on using the Vietnam draft lottery as a natural experiment, see Berinsky and Chatfield (2015). For machine learning approaches that attempt to automatically discover natural experiments inside big data sources, see Jensen et al. (2008) and Sharma, Hofman, and Watts (2015).

In terms of matching, for an optimistic review, see Stuart (2010), and for a pessimistic review, see Sekhon (2009). For more on matching as a kind of pruning, see Ho et al. (2007). For books that provide excellent treatments of matching, see Rosenbaum (2002), Rosenbaum (2009), Morgan and Winship (2014), and Imbens and Rubin (2015).
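
The idea of matching as pruning can be sketched with a toy example. It uses greedy one-to-one nearest-neighbor matching on a single covariate with a caliper and drops any unit that has no close partner. This is a simplified stand-in and not the specific procedure of Ho et al. (2007). The function name, the caliper value, and the example units are hypothetical.

```python
def match_and_prune(treated, control, caliper):
    """Greedy 1-to-1 nearest-neighbor matching on one covariate.

    treated, control: lists of (unit_id, covariate_value).
    Returns (treated_id, control_id) pairs; units without a partner
    within the caliper are pruned from the analysis.
    """
    available = dict(control)  # control units not yet used in a pair
    pairs = []
    for t_id, t_x in treated:
        if not available:
            break
        # Closest remaining control unit on the covariate.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_x))
        if abs(available[c_id] - t_x) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical units with age as the only covariate.
treated = [("t1", 25), ("t2", 40), ("t3", 70)]
control = [("c1", 24), ("c2", 41), ("c3", 43), ("c4", 20)]
pairs = match_and_prune(treated, control, caliper=3)
print(pairs)
```

Here the treated unit aged 70 and two of the control units are discarded because no sufficiently similar partner exists. The remaining pairs are more comparable, at the cost of a smaller dataset.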