In the analog age, collecting data about behavior (who does what, and when) was expensive and therefore relatively rare. Now, in the digital age, the behaviors of millions of people are recorded, stored, and analyzable. For example, every time you click on a website, make a call on your mobile phone, or pay for something with your credit card, a digital record of your behavior is created and stored by a business. Because these types of data are a by-product of people's everyday actions, they are often called digital traces. In addition to these traces held by businesses, governments also have very rich data about both people and businesses, data that are often digitized and analyzable. Together, these business and government records are often called big data.
The ever-rising flood of big data means that we have moved from a world where behavioral data were scarce to a world where behavioral data are plentiful. But because these types of data are relatively new, an unfortunate amount of the research using them looks like scientists blindly chasing whatever data happen to be available. This chapter, instead, offers a principled approach to understanding the different data sources and how they can be used. This firmer understanding should help you better match your research questions with appropriate data sources. Or, if such sources are lacking, it should convince you to collect your own data using the ideas in the coming chapters.
A first step toward learning from big data is realizing that it is part of a broader category of data that has been used in social research for many years: observational data. Roughly, observational data are any data that result from observing a social system without intervening in some way. A crude way to think about it is that observational data are everything that does not involve talking with people (e.g., surveys, the topic of Chapter 3) or changing people's environments (e.g., experiments, the topic of Chapter 4). Thus, in addition to business and government records, observational data also include things like the text of newspaper articles and satellite photos.
This chapter has three parts. First, in Section 2.2, I describe big data in more detail and clarify an important difference between it and the data that have typically been used in social research in the past. Then, in Section 2.3, I describe ten common characteristics of big data sources. Understanding these characteristics helps you quickly recognize the strengths and weaknesses of existing sources, and it will help you make use of the new sources that will be created in the future. Finally, in Section 2.4, I describe three main research strategies that you can use to learn from observational data: counting things, forecasting things, and approximating experiments.