[ , ] Algorithmic confounding was a problem with Google Flu Trends. Read the paper by Lazer et al. (2014), and write a short, clear email to an engineer at Google explaining the problem and offering an idea of how to fix it.
[ ] Bollen, Mao, and Zeng (2011) claim that data from Twitter can be used to predict the stock market. This finding led to the creation of a hedge fund, Derwent Capital Markets, to invest in the stock market based on data collected from Twitter (Jordan 2010). What evidence would you want to see before putting your money in that fund?
[ ] While some public health advocates consider e-cigarettes an effective aid for smoking cessation, others warn about the potential risks, such as the high levels of nicotine. Imagine that a researcher decides to study public opinion toward e-cigarettes by collecting e-cigarette-related Twitter posts and conducting sentiment analysis.
[ ] In November 2009, Twitter changed the question in the tweet box from "What are you doing?" to "What's happening?" (https://blog.twitter.com/2009/whats-happening).
[ ] "Retweets" are often used to measure influence and the spread of influence on Twitter. Initially, users had to copy and paste the tweet they liked, tag the original author with his or her handle, and manually type "RT" before the tweet to indicate that it was a retweet. Then, in 2009, Twitter added a "retweet" button. In June 2016, Twitter made it possible for users to retweet their own tweets (https://twitter.com/twitter/status/742749353689780224). Do you think these changes should affect how you use "retweets" in your research? Why or why not?
[ , , , ] In a widely discussed paper, Michel and colleagues (2011) analyzed the content of more than five million digitized books in an attempt to identify long-term cultural trends. The data that they used have now been released as the Google NGrams dataset, so we can use the data to replicate and extend some of their work.
In one of the many results in the paper, Michel and colleagues argued that we are forgetting faster and faster. For a particular year, say "1883," they calculated the proportion of 1-grams published in each year between 1875 and 1975 that were "1883". They reasoned that this proportion is a measure of the interest in events that happened in that year. In their figure 3a, they plotted the usage trajectories for three years: 1883, 1910, and 1950. These three years share a common pattern: little use before that year, then a spike, then decay. Next, to quantify the rate of decay for each year, Michel and colleagues calculated the "half-life" of each year for all years between 1875 and 1975. In their figure 3a (inset), they showed that the half-life of each year is decreasing, and they argued that this means that we are forgetting the past faster and faster. They used version 1 of the English language corpus, but Google has subsequently released a second version of the corpus. Please read all the parts of the question before you begin coding.
This activity will give you practice writing reusable code, interpreting results, and data wrangling (such as working with awkward files and handling missing data). This activity will also help you get up and running with a rich and interesting dataset.
Get the raw data from the Google Books NGram Viewer website. In particular, you should use version 2 of the English language corpus, which was released on July 1, 2012. Uncompressed, this file is 1.4GB.
Recreate the main part of figure 3a of Michel et al. (2011). To recreate this figure, you will need two files: the one you downloaded in part (a) and the "total counts" file, which you can use to convert the raw counts into proportions. Note that the total counts file has a structure that may make it a bit hard to read in. Does version 2 of the NGram data produce similar results to those presented in Michel et al. (2011), which are based on version 1 data?
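As a starting point for converting raw counts into proportions, here is a minimal sketch in Python. It rests on two assumptions about file layout that you should check against the files you actually download: that the total counts file is a single line of tab-separated records of the form `year,match_count,page_count,volume_count`, and that each row of the 1-gram file is `ngram<TAB>year<TAB>match_count<TAB>volume_count`. The function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def parse_total_counts(text):
    """Parse the total-counts text into {year: total match count}.

    Assumes tab-separated records, each 'year,match_count,page_count,volume_count'.
    """
    totals = {}
    for record in text.split("\t"):
        record = record.strip()
        if not record:
            continue  # skip empty records at the start or end of the line
        year, match_count, _pages, _volumes = record.split(",")
        totals[int(year)] = int(match_count)
    return totals


def parse_ngram_counts(lines, targets):
    """Collect {ngram: {year: match count}} for the 1-grams in `targets`.

    Assumes each line is 'ngram<TAB>year<TAB>match_count<TAB>volume_count'.
    """
    counts = defaultdict(dict)
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 4:
            continue  # ignore malformed rows
        ngram, year, match_count, _volumes = parts
        if ngram in targets:
            counts[ngram][int(year)] = int(match_count)
    return counts


def to_proportions(counts_by_year, totals, start=1875, end=1975):
    """Divide raw counts by yearly totals; years with no row are treated as zero."""
    return {
        year: counts_by_year.get(year, 0) / totals[year]
        for year in range(start, end + 1)
        if year in totals
    }
```

Because `parse_ngram_counts` accepts any iterable of lines, you can pass it an open file handle and stream the 1.4GB file rather than loading it into memory. Treating missing years as zero mentions is itself a judgment call that you may want to revisit.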
Now check your graph against the graph created by the NGram Viewer.
Recreate figure 3a (main figure), but change the \(y\)-axis to be the raw mention count (not the rate of mentions).
Does the difference between (b) and (d) lead you to reevaluate any of the results of Michel et al. (2011)? Why or why not?
Now, using the proportion of mentions, replicate the inset of figure 3a. That is, for each year between 1875 and 1975, calculate the half-life of that year. The half-life is defined to be the number of years that pass before the proportion of mentions reaches half its peak value. Note that Michel et al. (2011) do something more complicated to estimate the half-life (see section III.6 of the Supporting Online Information), but they claim that both approaches produce similar results. Does version 2 of the NGram data produce similar results to those presented in Michel et al. (2011), which are based on version 1 data? (Hint: Don't be surprised if it doesn't.)
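The simple half-life definition above can be sketched in a few lines of Python. This is only one reading of the definition, not the more complicated procedure of Michel et al. (2011): it assumes the clock starts at the year in which the proportion peaks, and it returns `None` when the series never falls to half the peak within the available data. The function name and the input format (a dictionary mapping publication year to proportion) are assumptions made for illustration.

```python
def half_life(proportions):
    """Years from the peak until the proportion first drops to half the peak or below.

    `proportions` maps publication year -> proportion of mentions.
    Returns None if the series never reaches half its peak after the peak year.
    """
    years = sorted(proportions)
    # With ties, max() keeps the earliest year that attains the peak value.
    peak_year = max(years, key=lambda y: proportions[y])
    threshold = proportions[peak_year] / 2
    for year in years:
        if year > peak_year and proportions[year] <= threshold:
            return year - peak_year
    return None
```

For example, a series that peaks in 1883 and first falls to half of that peak in 1886 has a half-life of 3 under this reading. If you prefer to start the clock at the named year itself rather than at the peak, the change is a one-line edit, and it is worth checking whether your conclusions are sensitive to that choice.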
Were there any years that were outliers, such as years that were forgotten particularly quickly or particularly slowly? Briefly speculate about possible reasons for that pattern and explain how you identified the outliers.
Now replicate this result for version 2 of the NGrams data in Chinese, French, German, Hebrew, Italian, Russian, and Spanish.
Comparing across all languages, were there any years that were outliers, such as years that were forgotten particularly quickly or particularly slowly? Briefly speculate about possible reasons for that pattern.
[ , , , ] Penney (2016) explored whether the widespread publicity about NSA/PRISM surveillance (i.e., the Snowden revelations) in June 2013 was associated with a sharp and sudden decrease in traffic to Wikipedia articles on topics that raise privacy concerns. If so, this change in behavior would be consistent with a chilling effect resulting from mass surveillance. The approach of Penney (2016) is sometimes called an interrupted time series design, and it is related to the approaches described in section 2.4.3.
To choose the topic keywords, Penney referred to the list used by the US Department of Homeland Security for tracking and monitoring social media. The DHS list categorizes certain search terms into a range of issues, i.e., "Health Concern," "Infrastructure Security," and "Terrorism." For the study group, Penney used the 48 keywords related to "Terrorism" (see appendix table 8). He then aggregated Wikipedia article view counts on a monthly basis for the corresponding 48 Wikipedia articles over a 32-month period, from the beginning of January 2012 to the end of August 2014. To strengthen his argument, he also created several comparison groups by tracking article views on other topics.
Now, you are going to replicate and extend Penney (2016). All the raw data that you will need for this activity are available from Wikipedia. Or you can get them from the R package wikipediatrend (Meissner and R Core Team 2016). When you write up your responses, please note which data source you used. (Note that this same activity also appears in chapter 6.) This activity will give you practice in data wrangling and thinking about natural experiments in big data sources. It will also get you up and running with a potentially interesting data source for future projects.
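To make the data wrangling step concrete, the following Python sketch aggregates daily article views into monthly totals and then compares the average month before and after June 2013. It is deliberately cruder than an interrupted time series analysis, because a difference in means ignores any trend that was already under way, so treat it as a first look rather than a replication of Penney (2016). The input format (pairs of a `datetime.date` and a view count) and the function names are assumptions for illustration; how you obtain the daily counts depends on the data source you choose.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily_views):
    """Sum (day, views) pairs into totals keyed by (year, month)."""
    totals = defaultdict(int)
    for day, views in daily_views:
        totals[(day.year, day.month)] += views
    return dict(totals)


def _mean(values):
    """Arithmetic mean, or None for an empty list."""
    return sum(values) / len(values) if values else None


def pre_post_means(totals, cutoff=(2013, 6)):
    """Mean monthly views before the cutoff month and from the cutoff month onward."""
    pre = [views for month, views in totals.items() if month < cutoff]
    post = [views for month, views in totals.items() if month >= cutoff]
    return _mean(pre), _mean(post)
```

Keying months as `(year, month)` tuples keeps them sortable and comparable without any date arithmetic. Running the same two functions on a comparison group of articles gives you a rough sense of whether any drop is specific to the privacy-sensitive topics.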
[ ] Efrati (2016) reported, based on confidential information, that "total sharing" on Facebook had declined by about 5.5% year over year, while "original broadcast sharing" was down 21% year over year. This decline was particularly acute among Facebook users under 30 years of age. The report attributed the decline to two factors. One is the growth in the number of "friends" people have on Facebook. The other is that some sharing activity has shifted to messaging and to competitors such as Snapchat. The report also revealed several tactics Facebook had tried in order to boost sharing, including News Feed algorithm tweaks that make original posts more prominent, as well as periodic reminders of original posts through the "On This Day" feature. What implications, if any, do these findings have for researchers who want to use Facebook as a data source?
[ ] What is the difference between a sociologist and a historian? According to Goldthorpe (1991), the main difference is control over data collection. Historians are forced to use relics, whereas sociologists can tailor their data collection to specific purposes. Read Goldthorpe (1991). How is the difference between sociology and history related to the idea of custommades and readymades?
[ ] This builds on the previous question. Goldthorpe (1991) drew a number of critical responses, including one from Nicky Hart (1994) that challenged Goldthorpe's devotion to tailor-made data. To clarify the potential limitations of tailor-made data, Hart described the Affluent Worker Project, a large survey to measure the relationship between social class and voting that was conducted by Goldthorpe and colleagues in the mid-1960s. As one might expect from a scholar who favored designed data over found data, the Affluent Worker Project collected data that were tailored to address a recently proposed theory about the future of social class in an era of rising living standards. But Goldthorpe and colleagues somehow "forgot" to collect information about the voting behavior of women. Here's how Nicky Hart (1994) summarized the whole episode:
"... it [is] difficult to avoid the conclusion that women were omitted because this 'tailor made' dataset was confined by a paradigmatic logic which excluded female experience. Driven by a theoretical vision of class consciousness and action as male preoccupations ..., Goldthorpe and his colleagues constructed a set of empirical proofs which fed and nurtured their own theoretical assumptions instead of exposing them to a valid test of adequacy."
Hart continued:
"The empirical findings of the Affluent Worker Project tell us more about the masculinist values of mid-century sociology than they inform the processes of stratification, politics and material life."
Can you think of other examples where tailor-made data collection has the biases of the data collector built into it? How does this compare to algorithmic confounding? What implications might this have for when researchers should use readymades and when they should use custommades?
[ ] In this chapter, I have contrasted data collected by researchers for researchers with administrative records created by companies and governments. Some people call these administrative records "found data," which they contrast with "designed data." It is true that administrative records are found by researchers, but they are also highly designed. For example, modern tech companies work very hard to collect and curate their data. Thus, these administrative records are both found and designed; it just depends on your perspective (figure 2.12).
Provide an example of a data source where seeing it as both found and designed is helpful when using that data source for research.
[ ] In a thoughtful essay, Christian Sandvig and Eszter Hargittai (2015) split digital research into two broad categories depending on whether the digital system is an "instrument" or an "object of study." An example of the first kind, where the system is an instrument, is the research by Bengtsson and colleagues (2011) on using mobile-phone data to track migration after the earthquake in Haiti in 2010. An example of the second kind, where the system is an object of study, is the research by Jensen (2007) on how the introduction of mobile phones throughout Kerala, India, impacted the functioning of the market for fish. I find this distinction helpful because it clarifies that studies using digital data sources can have quite different goals even if they are using the same kind of data source. In order to further clarify this distinction, describe four studies that you've seen: two that use a digital system as an instrument and two that use a digital system as an object of study. You can use examples from this chapter if you want.