Always-on big data enables the study of unexpected events and real-time measurement.
Many big data systems are always-on; they are constantly collecting data. This always-on characteristic provides researchers with longitudinal data (i.e., data over time). Being always-on has two important implications for research.
First, always-on data collection enables researchers to study unexpected events in ways that would not otherwise be possible. For example, researchers interested in studying the Occupy Gezi protests in Turkey in the summer of 2013 would typically focus on the behavior of protesters during the event. Ceren Budak and Duncan Watts (2015) were able to do more by using the always-on nature of Twitter to study protesters who used Twitter before, during, and after the event. They were also able to create a comparison group of nonparticipants before, during, and after the event (figure 2.2). In total, their ex-post panel included the tweets of 30,000 people over two years. By augmenting the commonly used data from the protests with this other information, Budak and Watts were able to learn much more: they were able to estimate what kinds of people were more likely to participate in the Gezi protests and to estimate the changes in attitudes of participants and nonparticipants, both in the short term (comparing pre-Gezi with during Gezi) and in the long term (comparing pre-Gezi with post-Gezi).
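To make the idea of an ex-post panel concrete, here is a minimal sketch in Python. It is not Budak and Watts's actual pipeline: the log records, the group labels, the event window, and the `attitude` score are all hypothetical placeholders that I have made up for illustration. The sketch shows only how an always-on log lets a researcher label each observation as pre, during, or post relative to an event that was defined after the fact, and then compare participants with nonparticipants in the short term and the long term.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical event window, chosen only for illustration.
EVENT_START = date(2013, 6, 1)
EVENT_END = date(2013, 6, 30)

# Hypothetical always-on log: (user_id, group, day, attitude score).
# In a real study the group label and score would be derived from the data.
LOG = [
    ("u1", "participant", date(2013, 3, 10), 0.2),
    ("u1", "participant", date(2013, 6, 5), 0.8),
    ("u1", "participant", date(2013, 11, 2), 0.6),
    ("u2", "nonparticipant", date(2013, 3, 12), 0.1),
    ("u2", "nonparticipant", date(2013, 6, 7), 0.3),
    ("u2", "nonparticipant", date(2013, 11, 5), 0.1),
]


def period(day):
    """Label a date as pre, during, or post relative to the event window."""
    if day < EVENT_START:
        return "pre"
    if day <= EVENT_END:
        return "during"
    return "post"


def panel_means(log):
    """Average score for each (group, period) cell of the ex-post panel."""
    cells = defaultdict(list)
    for _user, group, day, score in log:
        cells[(group, period(day))].append(score)
    return {cell: mean(scores) for cell, scores in cells.items()}


def change(means, group, start, end):
    """Change in a group's mean score between two periods."""
    return means[(group, end)] - means[(group, start)]


GROUPS = ("participant", "nonparticipant")
means = panel_means(LOG)
# Short-term comparison: pre-event versus during the event.
short_term = {g: change(means, g, "pre", "during") for g in GROUPS}
# Long-term comparison: pre-event versus post-event.
long_term = {g: change(means, g, "pre", "post") for g in GROUPS}
print(short_term)
print(long_term)
```

Because the log already exists, the event window can be chosen after the event has happened, and the same functions would work for any other window. That is the practical benefit of always-on collection that this sketch is meant to illustrate.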
A skeptic might point out that some of these estimates could have been made without always-on data collection sources (e.g., long-term estimates of attitude change), and that is correct, although such a data collection for 30,000 people would have been quite expensive. Even given an unlimited budget, however, I can't think of any other method that essentially allows researchers to travel back in time and directly observe participants' behavior in the past. The closest alternative would be to collect retrospective reports of behavior, but these reports would be of limited granularity and questionable accuracy. Table 2.1 provides other examples of studies that use an always-on data source to study an unexpected event.
Unexpected event | Always-on data source | Citation |
---|---|---|
Occupy Gezi movement in Turkey | Twitter | Budak and Watts (2015) |
Umbrella protests in Hong Kong | | Zhang (2016) |
Shootings of police in New York City | Stop-and-frisk reports | Legewie (2016) |
Person joining ISIS | | Magdy, Darwish, and Weber (2016) |
September 11, 2001 attacks | LiveJournal | Cohn, Mehl, and Pennebaker (2004) |
September 11, 2001 attacks | Pager messages | Back, Küfner, and Egloff (2010), Pury (2011), Back, Küfner, and Egloff (2011) |
In addition to studying unexpected events, always-on big data systems also enable researchers to produce real-time estimates, which can be important in settings where policy makers, in government or industry, want to respond based on situational awareness. For example, social media data can be used to guide emergency response to natural disasters (Castillo 2016), and a variety of different big data sources can be used to produce real-time estimates of economic activity (Choi and Varian 2012).
In conclusion, always-on data systems enable researchers to study unexpected events and provide real-time information to policy makers. I do not, however, think that always-on data systems are well suited for tracking changes over very long periods of time. That is because many big data systems are constantly changing, a process that I'll call drift later in the chapter (section 2.3.7).