4 Azure Data Lake Storage


This chapter covers

  • Setting up a Data Lake store
  • Configuring file access in Data Lake Storage
  • Understanding and planning for data drift

In the last chapter, you learned how to work with a fundamental Azure service, the Storage account. Storage accounts provide nearly unlimited storage for many Azure services, with high throughput and high redundancy. Storage accounts also host other file-based services, such as file shares and queues.

In this chapter, you’ll learn about another storage service, Azure Data Lake Storage (ADLS). You’ll create a Data Lake store and learn how to structure your data lake to increase maintainability and security. You’ll learn how this service supports other Azure services through Azure Active Directory authentication. This will be the central service around which you construct the analytics system.

ADLS resembles a local file system, with folders and files. Azure Active Directory (AAD) controls access to folders and files, with assignable read/write/execute permissions. ADLS provides the primary storage backbone for the master data set, a source of data for batch layer processing. ADLS also stores batch analysis artifacts, including the report files that make up the output of the Serving layer (see figure 4.1).

Figure 4.1 Lambda architecture with Azure PaaS services

Massive storage in ADLS allows for massive data, feeding massive batch jobs. ADLS builds on Hadoop and the Hadoop Distributed File System (HDFS). Hadoop manages storage and data retrieval across a horizontally scalable cluster of data nodes. ADLS provides security management and integration with other Azure services while abstracting Hadoop commands. ADLS provides a familiar interface over a complex system.

Let’s see what’s involved in creating a Data Lake store.

Tip

You can find the code listings in the GitHub repository for this book at https://github.com/rnuckolls/azure_storage.


4.1 Create an Azure Data Lake store

Setting up an ADLS store requires some information common to all Azure services, including a subscription, name, location, and resource group.

  • A subscription groups resources together for access control and billing.
  • A resource group groups resources together for management.
  • A location groups resources into a regional data center.
Important

By default, all files in ADLS are encrypted at rest. You should leave the management of the encryption keys to the service unless you have a system in place to manage them.

4.1.1 Using Azure Portal

Here’s how to create a new ADLS store.

  1. In the Azure portal, use the Create a Resource menu to open a New Data Lake Storage Gen1 blade, or use the All Services menu and filter on Data Lake Storage Gen1. Or you can go directly to the New Data Lake Storage Gen1 blade at https://portal.azure.com/#create/Microsoft.AzureDataLakeStore. Figure 4.2 shows the New Data Lake Storage Gen1 blade.
  2. Choose a name (for example, adedeveastus2, the name used in this chapter’s listings). The name must be lowercase alphanumeric, between 3 and 24 characters, and globally unique. Read more about Azure service naming conventions in chapter 3.
  3. Choose a subscription. The default will be the oldest subscription, if you have access to more than one.
  4. Choose a resource group. (See appendix A for instructions if you haven’t created one.)
  5. Choose a location. ADLS stores are not available in all regions; choose one close to you. Keep resources that interact in the same region to minimize network latency. You may choose a region to match your user base, as some governments restrict moving data outside their zone of control.
  6. Choose a pricing package, or leave the default, Pay-as-you-go. This minimizes costs for your tutorial usage. In a production system, reserving storage up front provides discounts. See http://mng.bz/6AXy for more information.
  7. Choose an encryption management scheme, or leave the default, Enabled. To use a self-managed key, you will need to create an Azure Key Vault and an encryption key.
  8. Create the Data Lake store.
Figure 4.2 Creating a Data Lake store
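
The naming rule from step 2 can be checked locally before you open the portal. The following is a minimal sketch, not part of the Azure PowerShell module: the function name Test-DataLakeStoreName is hypothetical, and it checks only the character and length rules. Global uniqueness can be confirmed only by Azure itself.

```powershell
# Hypothetical helper: checks the local naming rules only (lowercase
# alphanumeric, 3 to 24 characters). It cannot check global uniqueness.
function Test-DataLakeStoreName {
    param([Parameter(Mandatory)][string]$Name)
    # -cmatch is case-sensitive, so uppercase letters are rejected.
    return $Name -cmatch '^[a-z0-9]{3,24}$'
}

Test-DataLakeStoreName -Name "adedeveastus2"      # satisfies the local rules
Test-DataLakeStoreName -Name "ADE-dev-eastus2"    # uppercase and hyphens are rejected
```

A name that passes this check can still be refused by Azure if another store anywhere already uses it.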

4.1.2 Using Azure PowerShell

You can also create an ADLS store via Azure PowerShell. Use the New-AzDataLakeStoreAccount command to create it. The command takes the resource group, location, and a name for the new ADLS store. You can disable file encryption by passing DisableEncryption, or specify an encryption scheme using the Encryption parameter. Encryption takes ServiceManaged or UserManaged as a value. User-managed encryption requires the use of Azure Key Vault, which you’ll learn more about in chapter 10. Access Azure PowerShell by visiting Azure Cloud Shell at https://shell.azure.com/, or by clicking the >_ header menu in the Azure portal.

Listing 4.1 Create new Data Lake store
New-AzDataLakeStoreAccount -ResourceGroupName "ade-dev-eastus2" `
 -Name "adedeveastus2" -Location "East US 2" `
 -Encryption "ServiceManaged"

This script will return an error if an ADLS store by that name exists, or if ADLS is not available in the selected region. ADLS is not available in all regions, so it’s a good idea to select a region that supports ADLS before creating the rest of your services, to keep all services in the same region. This lowers the latency of network communication between services.
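
Both failure cases can be checked before you run the create command. The following is a sketch that assumes you are signed in with the Az module loaded and that the names match the earlier listing. It reads the regions offered by the Microsoft.DataLakeStore resource provider and uses Test-AzDataLakeStoreAccount to look for an existing store. The existence check covers only stores visible to your subscription, so a globally taken name can still fail at creation time.

```powershell
$Name     = "adedeveastus2"
$Location = "East US 2"

# Regions in which the Data Lake Store provider offers accounts.
$Provider = Get-AzResourceProvider -ProviderNamespace "Microsoft.DataLakeStore"
$Regions  = ($Provider.ResourceTypes |
    Where-Object { $_.ResourceTypeName -eq "accounts" }).Locations

if ($Regions -notcontains $Location) {
    Write-Warning "ADLS is not offered in '$Location'."
}
elseif (Test-AzDataLakeStoreAccount -Name $Name) {
    # Only sees stores in the current subscription.
    Write-Warning "A store named '$Name' already exists."
}
else {
    Write-Output "Region is supported and no store named '$Name' was found."
}
```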

You can also specify some other options during setup. You can add key/value pairs called tags to help locate the service later. If you know your storage size, you can pre-purchase storage at a discounted rate, using the Tier parameter. Possible values for the Tier parameter include Consumption, Commitment1TB, Commitment10TB, Commitment100TB, Commitment500TB, Commitment1PB, and Commitment5PB.

Listing 4.2 Create a new Data Lake store with options
# -Tag creates a tag called User with value ADE.
# -Tier pre-purchases 1 TB of storage each month and saves 12% over the basic rate.
New-AzDataLakeStoreAccount -ResourceGroupName "ade-dev-eastus2" `
 -Name "adedeveastus2" -Location "East US 2" `
 -Tag @{User="ADE"} `
 -Tier Commitment1TB

Add a tag for management of resources. This is especially useful when browsing the All Resources blade in the portal, because you can filter the list of resources using the tags you have provided. Use a consumption plan until you’ve calculated your monthly storage needs. You can purchase a plan at any time. Pre-purchasing capacity lowers costs, bringing them closer to that of Storage account Blobs.

Tip

If you know how much data you will store in the Data Lake, purchasing capacity in advance can save you money. Once your storage level passes a threshold close to your commitment, you can reduce your spending. For example, if you are archiving more than 900 GB of log files, purchasing 1 TB of capacity will cost less than paying as you go. Overages are calculated at the standard rate of $0.039/GB.
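
The 900 GB threshold follows from the two figures given in this section: the standard rate of $0.039/GB and the 12% saving for the 1 TB commitment. Here is a rough sketch of the arithmetic, assuming 1 TB is counted as 1,024 GB and that the discount applies to the whole commitment. Actual prices vary by region and over time.

```powershell
$PayAsYouGoPerGB = 0.039   # standard rate per GB per month
$CommitmentGB    = 1024    # assumption: 1 TB counted as 1,024 GB
$Discount        = 0.12    # saving quoted for the Commitment1TB tier

# Monthly price of the 1 TB commitment at the discounted rate.
$CommitmentCost = $CommitmentGB * $PayAsYouGoPerGB * (1 - $Discount)

# Stored volume at which pay-as-you-go costs the same as the commitment.
$BreakEvenGB = $CommitmentCost / $PayAsYouGoPerGB

"Commitment cost per month: {0:N2}" -f $CommitmentCost
"Break-even volume in GB:   {0:N0}" -f $BreakEvenGB
```

Under these assumptions the break-even point lands just above 900 GB, which is where the commitment starts to pay for itself.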

Creating Storage accounts and ADLS stores are basic skills for storing data files in Azure. The AAD user account that created the Storage account becomes the owner and has full permissions to administer the resource. If you are the sole admin for your business, this may be sufficient. For most use cases, additional users are given access following service creation. Giving access to users expands the usefulness of storage services. Let’s look at methods for allowing secure access to the storage services.


4.2 Data Lake store access

You created the ADLS store, but the service is not useful for anyone else in its current state. Only you, the owner, can access it. For others to use it, you need to define, plan, and implement an access scheme.

4.2.1 Access schemes

The Finance and Operations departments want a joint file archive that can be used with multiple Azure services. Users and systems will upload files from each department. Access for uploading files for one department must not allow reading files from the other department. You need to design and implement an access scheme in a Data Lake store that satisfies these requirements. How can you accommodate this request?

An access scheme defines who can access resources and what they are allowed to access. Authentication encompasses validating the identity of the entity making the request. Authorization matches authorized entities with the actions they can perform. Authentication for ADLS is handled by AAD.

AAD is Microsoft’s cloud-based identity and access management service. It provides sign-in services and account management for Office 365, the Azure portal, and other applications. AAD can use directory synchronization to allow on-premises and cloud applications to use the same account. Authentication is handled at the user level; authorization can be set for users and groups.

Authorization for ADLS actions is defined in two ways. For managing the service, role-based access controls (RBAC) allow or deny access to tasks like deleting the ADLS store, assigning roles to users, and purchasing reserved storage. For managing file and folder access, access control lists (ACLs) define granular access. The ACLs use standard Read (R), Write (W), and Execute (X) permissions. To see how these two authorization approaches work, you’ll look at the security model for ADLS. Then you’ll see how to plan and implement the folder hierarchy and file access. By the end of the section, you’ll be able to configure secure file access for ADLS.

Least privilege

Up to this point, the ADLS stores you’ve created have been accessible only by the owner, you. Now you need to broaden access to include other users. The Principle of Least Privilege states that “a subject should be given only those privileges needed for it to complete its task.”1 This scenario applies the principle by restricting access to a department’s files and folders to members of the department. By working through this scenario, you’ll learn some approaches to configuring RBAC and ACLs on ADLS.

4.2.2 Configuring access

How do you set up your new ADLS store to have a common file location but still restrict access to certain files? You need separate folders for Finance and Operations, with access and default ACLs specific to each department.

Root folder ACLs

Folders get an access ACL and a default ACL. Files get an access ACL only. Folder and file access is not inherited; each folder and file contains its permissions list in metadata. The access ACL determines access for the folder or file itself. The default ACL determines the access ACL for files created in the folder and the default ACL for child folders.

Access ACLs can be set using the Access blade of a file or folder in the Data Explorer blade of an ADLS store, in the Azure portal. Default ACLs can be set using the Advanced blade in the Access blade of a folder in the Data Explorer blade, in the Azure portal. The Advanced blade also lets you apply ACLs to child folders.

New folders created in the ADLS store’s root folder “/” copy the ACLs from the root folder. When the ADLS store is first created, the root folder has a unique access ACL and default ACL. The AAD account used to create the ADLS store is assigned the root folder owner role. Since you are the creator, you are the root folder owner by default. A security group, with an all-zero GUID, is assigned to the root folder owner role, to satisfy a requirement that folders must have a group owner. This null security group does not permit access and should be replaced with a valid group. This ensures access to the folders and files in case access via the creator account is lost. You’ll replace the owner group in the next section.
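
You can inspect the starting state of the root folder from Azure PowerShell before changing anything. The following is a small sketch that assumes the store name used in this chapter’s listings. It lists the ACL entries on “/” and then shows the owning user and owning group. On a freshly created store, the owning group should be the all-zero GUID described above.

```powershell
$Account = "adedeveastus2"

# ACL entries currently set on the root folder.
Get-AzDataLakeStoreItemAclEntry -Account $Account -Path "/"

# Owning user and owning group of the root folder.
Get-AzDataLakeStoreItemOwner -Account $Account -Path "/" -Type User
Get-AzDataLakeStoreItemOwner -Account $Account -Path "/" -Type Group
```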

Root folder owner

Assigning ownership, along with access and default ACLs, to the root folder creates the broad outlines of your access scheme. If you don’t have an AAD user and group, see appendix A for a PowerShell script to create them.

Azure uses AAD extensively for user and service authentication. You should already be familiar with the Azure portal and using AAD for authentication and authorization to Azure services. Listing 4.3 shows an Azure PowerShell script for creating a new user and security group, and assigning the user to the group. The script uses the commands New-AzADUser, New-AzADGroup, and Add-AzADGroupMember. The new user command requires a display name, a mail name, a principal name which is an email address, and a password. The security group command requires a display name and a mail name. The group membership command requires both member and group identifiers: either user principal name or ID, and group display name, object, or ID. You need to construct a UserPrincipalName using one of the AAD registered domains. For a personal Azure account, use your signup email without the top-level domain, and append .onmicrosoft.com. For example, listing 4.3 uses the principal name techuser@azuredomain.onmicrosoft.com, built from the mail nickname techuser and the registered domain azuredomain.onmicrosoft.com.

Execute these lines with Azure PowerShell to create the user and group membership.

Listing 4.3 New AAD user and group
# Prompt for a password for the new user.
$SecureStringPassword = Read-Host -Prompt "Enter password" -AsSecureString

# Use the secure password, and build the principal name from
# MailNickname and your AAD registered domain.
$User = New-AzADUser -DisplayName "Tech User" `
 -Password $SecureStringPassword -MailNickname "techuser" `
 -UserPrincipalName "techuser@azuredomain.onmicrosoft.com"

$Group = New-AzADGroup -DisplayName "Technical Operations" `
 -MailNickname "TechOps"

# Get the IDs from $User and $Group, returned by the commands above.
Add-AzADGroupMember -MemberObjectId $User.Id `
 -TargetGroupObjectId $Group.Id

This PowerShell script will return an error if a group by that name exists. The new user and security group have no access to the ADLS store at this time, but allow authentication to Azure and the various services. Next, you’ll give the group, and through it the user, access permissions in the ADLS store.

Note

If you are using an Azure subscription without a corporate Active Directory, then your domain will be some variation of the email you used to sign up with Azure. You can find this value by going to the AAD Overview blade. The domain is listed above the header Default Directory. It is also listed in the Custom Domain Names blade.

Now you’ve created a user and a security group in AAD using PowerShell. You can use them when securing the ADLS store’s root directory. Set the owning group to the TechOps group using the Azure portal.

  1. In the Azure portal, use the All Services menu and filter on Data Lake Storage Gen1 to show the Data Lake Storage Gen1 blade.
  2. Select your Data Lake store to display the Overview blade.
  3. In the Overview blade, click Data Explorer.
  4. In the Data Explorer blade, click Access. Figure 4.3 shows the Access blade.
  5. In the Access blade, verify that “/ (Folder)” is displayed below the header, indicating you have selected the root folder.
  6. In the Owners section, click the group 00000000-0000-0000-0000-000000000000.
  7. In the Access Details blade, click Change Owning Group. This opens an AAD search blade.
  8. In the Select User or Group blade, search for and select the TechOps AAD group, and click Select.
Figure 4.3 Assigning owning group to Data Lake store folder

You can also set the owning group on the ADLS store root folder “/” with PowerShell. The Set-AzDataLakeStoreItemOwner command sets the owner for a folder. The Account parameter specifies the name of the ADLS store you wish to update. The forward-slash character following the Path parameter indicates the root folder. This PowerShell script changes the security group owner, passing in the ID of the “Technical Operations” AAD group. Execute this line in PowerShell Core with the Azure PowerShell module loaded.

Listing 4.4 Set Data Lake owner
# Set the group owner (instead of the user owner) of the root folder,
# looking up the security group ID by display name.
Set-AzDataLakeStoreItemOwner -Account "adedeveastus2" `
 -Path / -Type Group `
 -Id (Get-AzADGroup -DisplayName "Technical Operations").Id

Get-AzADGroup returns a security object, which has an ID property. Instead of in-lining the group object lookup, you could include the GUID directly.
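
A middle path between the in-line lookup and a hard-coded GUID is to look up the group once and confirm the result before using it. The following is a sketch under the assumption that the group and store names match the earlier listings. Because AAD does not force group display names to be unique, the script stops unless exactly one group matches.

```powershell
$Group = Get-AzADGroup -DisplayName "Technical Operations"

# Insist on exactly one match before changing ownership.
if (@($Group).Count -ne 1) {
    throw "Expected one 'Technical Operations' group, found $(@($Group).Count)."
}

Set-AzDataLakeStoreItemOwner -Account "adedeveastus2" -Path "/" `
    -Type Group -Id $Group.Id
```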

Next, set a fallback access ACL for non-owners on the root folder. This ACL has Read (R) and Execute (X) permissions. Without this ACL, non-owner AAD users will not be able to list the folder structure from the root folder. Users with access to specific files can access them directly via URL, in the form adl://adedeveastus2.azuredatalakestore.net/file.txt. You can find this path under Properties for the file in the Azure portal Data Explorer blade. As an alternative, you could apply specific ACLs for AAD groups as they are added to the Data Lake service and give access to one or more folders in the store. Try setting the fallback ACL using the Azure portal.

  1. In the Overview blade of the Data Lake service, click Data Explorer.
  2. The Data Explorer blade opens in the root folder.
  3. Add a file and folder list ACL for everyone on the root folder by clicking Access, then checking the Read and Execute boxes under Everyone Else.
  4. Click Save to set the ACL on the root folder.

You can also set the ACL using PowerShell. The Set-AzDataLakeStoreItemAclEntry command takes full words as values for its Permissions parameter. Values include None, Execute, Write, WriteExecute, Read, ReadExecute, ReadWrite, and All. The AceType parameter defines the type of ACL to add: User, Group, Mask, or Other. Other sets the ACL for “everyone else” (users and groups that don’t have a specified permission). Mask applies to all users and groups, and the folder and file owning group. The owning group is set when a folder or file is created. For ADLS service owners, a default mask ACL on the root “/” folder gives owners access to all files and folders. Execute this line with Azure PowerShell to set the fallback ACL.

Listing 4.5 Set Data Lake default access entry
# Use AceType Other to set the "everyone else" ACL.
Set-AzDataLakeStoreItemAclEntry -AccountName "adedeveastus2" `
 -Path / -AceType Other -Permissions ReadExecute
Important

Make sure that one of your first steps is assigning a valid security group to the service owner role and root folder owning group when configuring security. If your AAD account gets locked out or removed, users in the owning group can still access the Data Lake folders. This rationale applies to the service owner role as well. The service owner role can view all data in the store, but the root folder owning group cannot manage the Data Lake service unless its members have more than the Reader role. You can choose different security groups for different roles. For instance, your organization’s Technical Operations group can manage a Data Lake service as an owner, while the Analysts or Architects group can be the root folder owning group.

4.2.3 Hierarchy structure in the Data Lake store

Storing lots of data in cloud storage enables users to apply analysis over the data. Lowering the implementation effort for new data collection often means accepting unprocessed or uncurated data sets, which then need transformation and data cleansing to prepare them for consumption. Having a designated landing zone for the initial data marks a clear distinction between the original and processed data files. You’ll look more closely at planning folder hierarchies in section 4.3.1.

New inbound files folder

Now that you have the root folder in better shape, let’s add a top-level folder named Staging. The Staging folder is a target for storing unprocessed data. Finance and Operations will deposit files in subfolders under Staging. This Staging folder will inherit the owners assigned to the root folder on creation. You will also add an ACL to allow users to list any files and view the Staging folder itself. Without this fallback ACL, non-owner users will be unable to browse the folder hierarchy. Use the Azure portal to create the folder and set the ACL.

  1. Create the Staging folder by clicking New Folder and entering the name Staging.
  2. Click on the new Staging folder to browse it.
  3. Add a file and folder list ACL for everyone by clicking Access, then checking the Read and Execute boxes under Everyone Else.
  4. Click Save to set the ACL on the Staging folder.

This ACL will only give access for the Staging folder. Subfolders will belong to their respective groups, and general access won’t be provided. You’ll look more closely at this folder structure in section 4.3.1.
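
The same two portal actions can be scripted. The following is a sketch, assuming the store name from the earlier listings. It creates /Staging only when the folder is missing, then sets the “everyone else” entry to Read and Execute so that non-owners can browse the folder without being able to write to it.

```powershell
$Account = "adedeveastus2"

# Create /Staging only if it is not already there.
if (-not (Test-AzDataLakeStoreItem -Account $Account -Path "/Staging")) {
    New-AzDataLakeStoreItem -Account $Account -Path "/Staging" -Folder
}

# "Everyone else" may list and traverse the folder, but not write to it.
Set-AzDataLakeStoreItemAclEntry -Account $Account -Path "/Staging" `
    -AceType Other -Permissions ReadExecute
```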

Finance and Operations folders

In order to secure separate access to the Operations and Finance folders, you’ll need users and groups in AAD. You can create these using the Azure portal or Azure PowerShell. Here is an Azure PowerShell script for creating a new Finance user, a new security group, and assigning the user to the group. This script uses the same commands that were used to set the root folder ACLs. Execute these lines in PowerShell with the Azure PowerShell module loaded.

Listing 4.6 Finance AAD user and group
# Prompt for a password for the new user.
$SecureStringPassword = Read-Host -Prompt "Enter password" -AsSecureString

# Use the secure password, and build the principal name from
# MailNickname and your AAD registered domain.
$User = New-AzADUser -DisplayName "Finance User" `
 -Password $SecureStringPassword -MailNickname "financeuser" `
 -UserPrincipalName "financeuser@azuredomain.onmicrosoft.com"

$Group = New-AzADGroup -DisplayName "Finance" `
 -MailNickname "Finance"

# Get the IDs from $User and $Group, returned by the commands above.
Add-AzADGroupMember -MemberObjectId $User.Id `
 -TargetGroupObjectId $Group.Id

Here is an Azure PowerShell script for creating a new Operations user, a new security group, and assigning the user to the group.

Execute these lines in PowerShell with the Azure PowerShell module loaded.

Listing 4.7 Operations AAD user and group
# Prompt for a password for the new user.
$SecureStringPassword = Read-Host -Prompt "Enter password" -AsSecureString

# Use the secure password, and build the principal name from
# MailNickname and your AAD registered domain.
$User = New-AzADUser -DisplayName "Operations User" `
 -Password $SecureStringPassword -MailNickname "operationsuser" `
 -UserPrincipalName "operationsuser@azuredomain.onmicrosoft.com"

$Group = New-AzADGroup -DisplayName "Operations" `
 -MailNickname "Ops"

# Get the IDs from $User and $Group, returned by the commands above.
Add-AzADGroupMember -MemberObjectId $User.Id `
 -TargetGroupObjectId $Group.Id

Now that you’ve set up access to the Staging folder, create Finance and Operations folders under Staging in the same way. For these folders, assign a Read, Write, Execute (RWX) ACL to the Finance and Operations AAD groups, respectively.

Figure 4.4 shows how to set the ACLs on a folder. Use the Azure portal to create the folders and set the ACLs.

  1. Create the department folder for Finance under Staging.
  2. Click on the new Finance folder to browse the folder.
  3. Click Access to show the Assign Permissions blade, then click Add to configure the ACL.
  4. For Select User or Group, select the Finance group.
  5. For Select Permissions, check the Read, Write, and Execute boxes under Permissions.
  6. Select This Folder and All Children.
  7. Under Add As, select An Access Permission Entry and a Default Permission Entry. This will set the permissions on the folder and the inherited permissions on any new files.
  8. Click Ok to set the ACL on the Finance folder.
  9. Repeat this process for the Operations folder.
Figure 4.4 Assigning ACLs to a Data Lake store folder
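
The portal steps above map onto two calls to Set-AzDataLakeStoreItemAclEntry, one for the access entry and one for the default entry. This sketch assumes the Finance group and the /Staging/Finance folder already exist. Because the folder is new and empty, it does not pass the Recurse switch that would correspond to applying the entry to existing children.

```powershell
$Account = "adedeveastus2"
$Path    = "/Staging/Finance"
$Group   = Get-AzADGroup -DisplayName "Finance"

# Access entry: applies to the folder itself. All means Read, Write, Execute.
Set-AzDataLakeStoreItemAclEntry -Account $Account -Path $Path `
    -AceType Group -Id $Group.Id -Permissions All

# Default entry: copied to files and folders created under this folder later.
Set-AzDataLakeStoreItemAclEntry -Account $Account -Path $Path `
    -AceType Group -Id $Group.Id -Permissions All -Default
```

Changing $Path to "/Staging/Operations" and the display name to "Operations" would cover the second department.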

Data Lake store authorization

If your end users will only use command-line tools to load files, then you can manage the ADLS store using ACL permissions only. If they will use the Azure portal, PowerShell, or Storage Explorer, then one more step is required. You need to give access via RBAC permissions to any AAD user or group using the folders.

Many built-in roles are available. The Owner role has full control of the ADLS store. The Contributor role has full control, except for assigning access via RBAC permissions. The Reader role allows read-only access to management data, and access to interact with storage via tools. You can read about the security controls at http://mng.bz/5a9Z.

Every AAD user or group that needs to access the ADLS store must have at least the Reader role. To complete the setup of the two departments’ folders, you need to assign the Reader role to the two AAD groups.

Use the Azure portal to configure the roles.

  1. Click Access Control (IAM) in the Data Lake Service blade to show the Access Control blade.
  2. Click the Add Role Assignment or Add button within the Add a Role Assignment container to show the Add Role Assignment blade. Figure 4.5 shows the Add Role Assignment blade.
  3. Select Reader for the Role.
  4. For Assign Access To, select the default: Azure AD User, Group, or Service Principal. Other options allow authorization by Azure services.
  5. Start typing Finance in the Select input to filter the list of AAD users.
  6. Click the Finance group to select the group.
  7. Click Save to add the role assignment.
Figure 4.5 Assigning AAD group to Data Lake service roles
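
The role assignment can also be made from Azure PowerShell with New-AzRoleAssignment. The following is a sketch that assumes the store and the two department groups from this chapter exist. It scopes the Reader role to the Data Lake store itself, not to the whole resource group or subscription.

```powershell
# The store's resource ID is the scope for the role assignment.
$Store = Get-AzDataLakeStoreAccount -Name "adedeveastus2"

foreach ($GroupName in "Finance", "Operations") {
    $Group = Get-AzADGroup -DisplayName $GroupName
    New-AzRoleAssignment -ObjectId $Group.Id `
        -RoleDefinitionName "Reader" -Scope $Store.Id
}
```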

The Data Lake service owner role gives access to all the content, as well as management functions for the ADLS store. As an owner, you can assign other AAD users as owners too. These owners, and other Data Lake service roles, are separate from the root, subfolder, and file owners.

The Finance and Operations departments now have a joint file archive to store data for analysis. The initial process for setting up the data lake had four steps:

  1. Creating the ADLS store.
  2. Configuring default security.
  3. Creating the subfolders for each department.
  4. Configuring the folder security for each department.

In the next section, you’ll build on the initial root folder with a plan for data use, management, and governance. The plan materializes in a folder structure.


4.3 Storage folder structure and data drift

Data lakes provide unstructured storage and access controls for users and tools to perform analysis on stored data. This generates business value from the collected data. For this reason, it’s important that the data be of high quality and easy to locate. You’ve seen how a folder structure and access controls can define what actions are allowed. This is one form of data governance. Segregating unfiltered data from reviewed, corrected, enhanced, or otherwise processed data is another form. In this section, you’ll see how applying a structure from the start provides data governance and assists users. This structure helps your users understand and locate the data they need.

The Jonestown Sluggers back office manages the home stadium for the team. Finance has years of vendor sales data. They would like to store this data in your new data lake for analysis. The data is stored in CSV files in multiple folders and with different schemas. You want to ensure that the data is accessible and can be found by analysts. How can you accommodate this request?

4.3.1 Hierarchy structure revisited

Earlier in this chapter you created root-level folders to solve the immediate requirement for storing data, without much structure beyond that necessitated by ADLS. Now you’re going to look at an approach to structuring folders that provides a usage pattern for analysis. You already created a Staging folder in the root folder of the ADLS store. Now you’ll take this folder construction further.

Documenting a storage area, especially a data lake, is more than just taking notes to support it. Data files need attributes like source, type, quality, and date. These attributes help users to find the data they need. You can provide these attributes at a basic level using a combination of folder structure and file naming conventions.

Tip

Azure offers a service called Data Catalog. This service stores metadata on multiple sources of data, including files and folders in ADLS stores. Data Catalog is covered briefly in the final chapter of this book. You can find more information with Microsoft’s introduction to Data Catalog at http://mng.bz/oR2M.

Zones framework

Imagine splitting your data lake into multiple sections, or zones, based on the level of processing and/or transformation required. An initial zone would be a place for external services and users to upload unprocessed data. You created this zone when you created the Staging folder in the previous section. The next zone stores validated and slightly processed files for long-term access, without further modification. This is the Raw zone. The third zone allows storage of ad hoc query output, as well as other files used when creating new data investigations. This is the Sandbox zone. The last zone stores production query output for business use. This is the Curated zone.

In a production ADLS store, analytics and automation systems handle data movement into and between zones. Sources of business data, like application logging, user behavior tracking, and IoT data, flow into the Staging zone. Azure services like ADF and ADLA copy, clean, transform, and enrich data before outputting files to the Raw zone. Analysts use tools like ADLA to generate data files in their Sandbox folder. Finally, analysts and data engineers use these same tools to create final data sets in the Curated zone for end user queries. Figure 4.6 shows the layout of the zones framework, with the flows of data between the zones.
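
One way to keep the four zones consistent is to generate the folder paths from two short lists instead of typing each path by hand. The following is only a sketch: the zone names come from this section, but giving every zone a folder per department is an assumption made for illustration, and your own layout may differ, especially for Sandbox.

```powershell
$Zones       = "Staging", "Raw", "Sandbox", "Curated"
$Departments = "Finance", "Operations"   # assumption: same split in every zone

# Emit each zone folder followed by its department subfolders.
$Paths = foreach ($Zone in $Zones) {
    "/$Zone"
    foreach ($Department in $Departments) { "/$Zone/$Department" }
}

$Paths
```

Each generated path could then be passed to New-AzDataLakeStoreItem with the Folder switch, in the same way the listings in this section create folders one at a time.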

Note

You can read about creating and using ADLA jobs in chapter 7. You’ll read more about data movement in chapter 10 on data integration with ADF.

Staging zone

Let’s look at the Staging zone. This zone is for initially loading files into the data lake. Access to files and folders in the Staging zone should be limited to systems loading the data. Often these files need some type of processing before they can be used. Combining multiple small files into larger files for long-term storage would happen here. Distributed processing systems work most efficiently with fewer, larger files. Another example would be personally identifiable information (PII) cleansing. Files in the Staging zone should not be expected to remain long. Taking the design and structure from figure 4.6, here is a set of PowerShell commands to set up these folders for the zones framework. Execute listing 4.8, listing 4.9, listing 4.10, and listing 4.11 using a PowerShell client with Azure PowerShell loaded to set up the folders.

Figure 4.6 Folder structure with zones framework
Listing 4.8 Set up a Data Lake store Staging folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Staging" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Staging/Finance" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Staging/DevOps" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Staging/Operations" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Staging/Finance/Growth" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Staging/Finance/SalesW" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Staging/Finance/Product" -Folder

Remember to return and set an ACL for access to folders below Staging, according to product or department ownership.

Tip

ADLA performs best using splittable files between 250 MB and 1 GB. See http://mng.bz/6QWZ for more best practices.

Raw zone

Staging files are destined for the Raw folder. The Raw zone is the long-term home for data files. In the Raw zone, content of all types waits to be read as part of an analytics job or other request. Files in the Raw zone should remain in their original state, without modification or updating. To preserve this state, access to content in the Raw zone should be limited to read-only access for data analysts. Collections of data in the Raw zone can suffer from data drift. You'll examine the use of versioning to mitigate effects of data drift later in this section. Execute these Azure PowerShell commands to set up the folders.

Listing 4.9 Set up a Data Lake store Raw folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Finance" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Operations" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Finance/Sales-Reports" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Finance/Sales-Reports/Growth-v1" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Finance/Sales-Reports/Growth-v2" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Finance/Sales-Reports/Growth-v1/2017" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Finance/Sales-Reports/Growth-v2/2017" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Raw/Finance/Sales-Reports/Growth-v2/2018" -Folder
Warning

Many data exploration techniques rely on finding outliers. Cleaning and normalizing data at the Raw stage could remove valuable data. Consider storing even data rows that fail Staging processing in “Error” folders adjacent to the data files.

Sandbox zone

The Sandbox zone is an open area where data analysts can process files. It allows uploading of new data and creating multiple versions of combined data files as new data products are developed. Use this zone as a testing space for developing processing routines. The data can be minimally or majorly processed in the Sandbox. Access to content in the Sandbox zone should be unrestricted for each user. This zone does not serve data to end users. Execute these Azure PowerShell commands to set up the folders.

Listing 4.10 Set up a Data Lake store Sandbox folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Sandbox" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Sandbox/User1" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Sandbox/User2" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Sandbox/User3" -Folder

You can skip setting up user folders in the Sandbox zone until you have users ready to do analysis. Remember to secure each user folder with appropriate ACLs.

Curated zone

The Curated zone holds output from analytics jobs run against data files in the Raw zone. This data has been processed, preparing it for use by end users. A common use case would be exploration with visualization tools by business users. Access to content in the Curated zone should be limited to read-only access for business users and tools, and write access for data analysts and jobs creating the data sets. Execute these Azure PowerShell commands to set up the folders.

Listing 4.11 Set up a Data Lake store Curated folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Curated" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Curated/FolderA" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Curated/FolderB" -Folder
New-AzDataLakeStoreItem -AccountName "adedeveastus2" -Path "/Curated/FolderC" -Folder

These are the top-level zones that should be set up in the Data Lake store. These are community best practice, not a specific Microsoft recommendation. Figure 4.7 shows a hierarchy scheme using the zones approach.

In each zone, data files are sorted by department, source, and type. Sorting often includes a date of ingestion, or loading, into the ADLS store. The folder hierarchy of the Raw zone typically follows that of the Staging zone, but can vary depending on file aggregation or data drift controls. The Sandbox zone can be constructed using folders per user, rather than being broken up by department and source. Data sets in the Curated zone serve a business purpose, and frequently combine data from multiple sources. The Curated zone can be constructed using folders per business unit, project, or along security boundaries.

Later chapters will walk through the processes that move data between folders, create new data files, and return data to the end user. When you use a folder hierarchy and enforce it with security controls, you reduce the likelihood of your data lake becoming a data swamp.
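
The zone layout can also be generated from a small data structure rather than typed one command at a time. The following is a minimal sketch, assuming a hypothetical helper named Get-ZoneFolderPath and an illustrative layout that mirrors only part of listings 4.8 through 4.11. It builds path strings only, so it runs without an Azure connection.

```powershell
# Hypothetical helper (not part of Azure PowerShell): expands a zone layout
# into the list of folder paths to create, parents before children.
function Get-ZoneFolderPath {
    param(
        # Keys are zone names; values are sub-folder paths relative to the zone.
        [Parameter(Mandatory)] [System.Collections.IDictionary] $Layout
    )
    foreach ($zone in $Layout.Keys) {
        "/$zone"                      # the zone folder itself
        foreach ($sub in $Layout[$zone]) {
            "/$zone/$sub"             # each folder below the zone
        }
    }
}

# Illustrative layout; the folder names are taken from the listings.
$layout = [ordered]@{
    Staging = @('Finance', 'Operations')
    Raw     = @('Finance', 'Finance/Sales-Reports')
    Sandbox = @('User1')
    Curated = @('FolderA')
}

$paths = Get-ZoneFolderPath -Layout $layout
$paths
```

Each resulting path could then be passed to New-AzDataLakeStoreItem with the -Folder switch, as in the listings. Keeping the layout in one place makes it easier to review the hierarchy before anything is created.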

Figure 4.7 Azure Data Lake folder hierarchy

4.3.2 Data drift

When the structure of the data storage format changes over time, this is data drift. “Data drift exists in three forms: structural drift, semantic drift, and infrastructure drift.”2 The set of fields contained in a data file can increase or decrease over time. This is structural drift. The content of each field can contain new values with the same meaning, or new meaning. This is semantic drift. And the systems generating, housing, or processing the data may change, leading to entirely different formats. This is infrastructure drift. Since computing systems change over time, data drift is a natural part of operating computer systems. Data drift causes problems with data analysis, because two sets of data structures representing the same type of content must be handled differently.
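
Structural drift is the easiest of the three forms to check mechanically, because it shows up as a difference between two sets of field names. The following is a minimal sketch, assuming a hypothetical helper named Compare-FieldSet and made-up column names. It does not detect semantic or infrastructure drift.

```powershell
# Hypothetical helper: compares two lists of field names and reports
# which fields were added or removed between file versions.
function Compare-FieldSet {
    param(
        [Parameter(Mandatory)] [string[]] $Baseline,   # fields in the older file
        [Parameter(Mandatory)] [string[]] $Current     # fields in the newer file
    )
    $added   = @($Current  | Where-Object { $_ -notin $Baseline })
    $removed = @($Baseline | Where-Object { $_ -notin $Current })
    [pscustomobject]@{
        Added      = $added
        Removed    = $removed
        HasDrifted = ($added.Count + $removed.Count) -gt 0
    }
}

# Illustrative headers for two versions of a sales file (assumed names).
$v1 = 'VendorId', 'SaleDate', 'Amount'
$v2 = 'VendorId', 'SaleDate', 'Amount', 'Currency'

$result = Compare-FieldSet -Baseline $v1 -Current $v2
$result
```

A check like this could run when files arrive, so that a changed header is noticed before the file is mixed in with files of the older structure.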

Data drift can be managed using folder structure and naming conventions. Segregating data files with differing structures prevents breaking changes to existing analysis processes. Thoughtful naming conventions provide direction for finding the correct data sources and matching the import logic to the structure and schema. Because later analysis must take into account these changes in the data itself, you should plan from the beginning ways to clearly identify the changes.

Mitigating data drift in zones

Now consider a folder hierarchy using the zones framework. Finance has years of vendor sales data, stored in CSV files in multiple folders and with different schemas. To store the vendor sales data, you have several options for folder structure. The files can be loaded directly into the Curated zone, if this data has been validated and aggregated. But this scenario states that the files will be used for analysis. In this case, the files can land in the Staging zone and be validated, cataloged, and moved to the Raw zone by an automated process. Alternatively, the Finance team could load the vendor sales data directly into the Raw zone. But how will you deal with differing schema in the data files?

The framework spreads out from the root and the four zones to numerous, more targeted folders. This organizes the files contained and conveys details about the files. Figure 4.8 depicts the zones framework as a pyramid.

Figure 4.8 Azure Data Lake folder hierarchy

Because the likelihood of data drift increases with time, you should incorporate versioning into your plan. Versioning the files allows analytics jobs to process files with the same schema in the same way, or to use different processes with differing schema.

Analytics jobs read from folders in the Raw zone hierarchy, which now includes four folder levels, defining:

  1. The originating department or team
  2. The source
  3. The data set
  4. A version of the file within

The folder naming convention matters less than the principle of segregation: an analytics job can read all the files in a single folder, and be guaranteed that their schema match. This hierarchy adds description to the files, and versioning mitigates the effects of data drift. Figure 4.9 shows this folder structure laid out.
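
The four levels can be captured in a small path-building function so that every job spells the folder name the same way. The following is a minimal sketch, assuming a hypothetical helper named Get-VersionedFolderPath and a naming rule that appends "-v" and the version number to the data set name, as in the Growth-v1 and Growth-v2 folders of listing 4.9. Other naming rules would work equally well.

```powershell
# Hypothetical helper: builds a zone folder path from the four levels
# (department, source, data set, version).
function Get-VersionedFolderPath {
    param(
        [Parameter(Mandatory)] [string] $Zone,
        [Parameter(Mandatory)] [string] $Department,
        [Parameter(Mandatory)] [string] $Source,
        [Parameter(Mandatory)] [string] $DataSet,
        [Parameter(Mandatory)] [int]    $Version
    )
    # Assumed convention: the version is a suffix on the data set folder.
    "/${Zone}/${Department}/${Source}/${DataSet}-v${Version}"
}

$path = Get-VersionedFolderPath -Zone 'Raw' -Department 'Finance' `
    -Source 'Sales-Reports' -DataSet 'Growth' -Version 2
$path
```

When a schema changes, only the Version argument changes, and jobs written against the older folder keep reading files with the older schema.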

Figure 4.9 Folder structure with file versions

With this structure in place, Finance can load their data into the Raw zone. If needed, you can create an automated process to migrate data from the Staging zone Vendors folder into Raw. Finance would load new vendor data into Staging, but would need to notify you of any change in schema. Otherwise, you may need to rely on failure notifications or other processes to detect the data drift. For your part, you will need to create the new folder for the new file version, and update any data movement and analytics jobs to add references to the new version.

With the zones framework, versioning by file folder works in any of the four top zones: Staging, Raw, Sandbox, or Curated. The zone structure provides flexibility around your implementation, especially at the lowest levels. Figure 4.8 shows a hierarchy scheme using ingestion date as the lowest level. This works well in a slowly changing environment, with little data drift, and analysis bounded by date ranges. This structure benefits from automated processing to create the folders and copy the files, especially when including month and day folders. Consider some other lower-level folder variations.

  • How granular are the data files? Will they be combined by week or month to improve efficiency? Try segregating by file version.
  • How great is the volume? Will the files need to be divided into smaller files by day, hour, or minute? Try a year, month, day folder structure.
  • Does a single department generate a single format, such as images? Try segregating at the Source level by format, then by project.
  • How rapidly does the data drift? It's easiest to modify the folder structure at the bottom, rather than at the zone, department, or source level.
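
A year, month, day folder structure is usually created by an automated process. The following is a minimal sketch of the path calculation only, assuming a hypothetical helper named Get-DateFolderPath with a Depth parameter that selects how far down the date hierarchy goes. The base path is borrowed from listing 4.9 for illustration.

```powershell
# Hypothetical helper: appends year, year/month, or year/month/day folders
# to a base path, depending on the requested depth.
function Get-DateFolderPath {
    param(
        [Parameter(Mandatory)] [string]   $BasePath,
        [Parameter(Mandatory)] [datetime] $Date,
        [ValidateSet('Year', 'Month', 'Day')] [string] $Depth = 'Day'
    )
    $format = switch ($Depth) {
        'Year'  { 'yyyy' }
        'Month' { 'yyyy/MM' }
        'Day'   { 'yyyy/MM/dd' }
    }
    # The invariant culture keeps "/" as the separator on every machine.
    $BasePath.TrimEnd('/') + '/' +
        $Date.ToString($format, [cultureinfo]::InvariantCulture)
}

$base    = '/Raw/Finance/Sales-Reports/Growth-v2'
$byDay   = Get-DateFolderPath -BasePath $base -Date ([datetime]'2018-06-30')
$byMonth = Get-DateFolderPath -BasePath $base -Date ([datetime]'2018-06-30') -Depth Month
$byDay
$byMonth
```

Choosing the depth per data set keeps low-volume sources from ending up with thousands of nearly empty day folders.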
Tip

Many data lakes collect data sources from third parties. Consider adding a “Third-Party” or individual third-party folders at the Department level of the zones framework.

With the zones framework, you have a model for minimizing the impact of data drift in your ADLS store. It also provides a method for managing your ADLS store's security structure. With these attributes in mind, you can create an ADLS store to serve your analytics system well.

4.4 Copy tools for Data Lake stores

Several Microsoft tools operate within Azure, copying files between services. Keeping the data transfer in Azure, rather than downloading and uploading files, minimizes network egress charges. Network transfers within an Azure data center are faster than across the Internet. ADLCopy is a command-line tool for copying files from Storage accounts to ADLS stores and between ADLS stores. ADLA can perform the same functions as ADLCopy. (You can read more about ADLA in chapter 7.) Azure Data Factory (ADF) uses cloud scheduling and Azure runtimes, including ADLA, to copy data between services. (You can read about ADF in chapter 10.) You can even export files directly from SQL Data Warehouse to ADLS. Chapter 3 discussed copying files into Storage accounts. Figure 4.10 adds two more options for tools to copy files to Azure storage services.

Each tool has a strong use case.

  • The Azure portal is available without an install.
  • ADLCopy can be used for automated file copying without user interaction.
  • File copying with ADF can be included in multi-step workflows and integrated with other Azure services.
  • Storage Explorer provides an easy-to-use GUI and status tracking of actions.
  • ADLA can retrieve data during processing jobs.

4.4.1 Data Explorer

You can use the Azure portal to manage files and folders in ADLS, including uploading files and setting access permissions. Data Explorer is a blade within the ADLS Service Management blade in the Azure portal. You can access it from the Overview blade, or via Data Lake Storage Gen1 > Data Explorer in the left menu.

Figure 4.10 Tools for copying files between storage services

To set up a new folder in the Staging zone for the ABC vendor, follow this example.

  1. In the All Services blade, enter “Data Lake Storage Gen1” in the filter and select the Data Lake Storage Gen1 service type to see your ADLS stores. Click on the ADLS store you created in the previous section, “adedeveastus2”.
  2. In the Overview blade of the ADLS store, click Data Explorer.
  3. Browse to the /Staging/Finance folder.
  4. In the Data Explorer blade, click New Folder.
  5. Name the folder Vendors. Folder names can be any string of characters that are valid in a URL.
  6. Click the Vendors folder to browse to it.
  7. In the Data Explorer blade, click New Folder.
  8. Name the folder ABC.
  9. Click the ABC folder to browse to it.
  10. Click Access to assign permissions to the folder.
  11. Click Upload to view the Upload Files blade.
  12. Click the folder icon to open a File Select dialog.
  13. Select your files and click Open to begin uploading them to the ABC Data Lake folder.
Figure 4.11 Creating a new folder in Data Lake store

4.4.2 ADLCopy tool

ADLCopy is a command-line tool for copying files between Storage accounts and ADLS stores and between ADLS stores. It doesn't copy files from on-premises stores to an ADLS store. Because the file copy occurs between storage systems in Azure, Azure resources are used to execute the copy, eliminating Internet bandwidth and single system constraints. You can download ADLCopy at https://aka.ms/downloadadlcopy.

The copy can run in standalone (shared) mode or ADLA (dedicated) mode. With standalone mode, Azure executes the job using available shared resources for ADLS. With ADLA mode, you configure the transfer to use dedicated resources and tweak the number of analytics units to balance cost and speed of the transfer. Dedicated resources ensure no throttling occurs during the transfer.

When using ADLCopy in standalone mode from a Storage account to an ADLS store, commands include four parameters:

  • Source is the path to the files.
  • Dest is the target of the file copy.
  • Sourcekey is the Storage account access key or shared access signature key. (See chapter 3.)
  • Pattern is a regex pattern to match the files for copying. Pattern is optional, and not supplying a pattern will copy all files in the Source path.

The following listing shows the use of ADLCopy to copy files from a Storage account to an ADLS store.

Listing 4.12 ADLCopy transfer standalone
"C:\Program Files (x86)\Microsoft SDKs\Azure\ADLCopy\adlcopy"
 /Source https://abc.blob.core.windows.net/project-abc/v1/v1.1
 /Dest adl://abc.azuredatalakestore.net/iislogs/v1/v1.1/         #1
 /SourceKey ==StorageKey== /Pattern "ch*.csv"                    #2
#1 Replicating the folder structure
#2 Use file patterns for finer control.
Note

See section 4.3.1 earlier in this chapter for a discussion of folder hierarchies and versioning in the data lake.

When using ADLCopy in ADLA mode from a Storage account to an ADLS store, commands include six parameters:

  • Source is the path to the files.
  • Dest is the target of the file copy.
  • Sourcekey is the Storage account access key or shared access signature key. See chapter 3 for more details.
  • Pattern is a regex pattern to match the files for copying. This is optional, and not supplying a pattern will copy all files in the Source path.
  • Account is the name of the ADLA account to use for executing the copy job.
  • Units specifies how many analytics units to use for the job. See chapter 7 for a discussion of ADLA analytics units.

The following listing shows the use of ADLCopy to copy files from a Storage account to an ADLS store, using your existing ADLA account to execute the job.

Listing 4.13 ADLCopy transfer with ADLA
"C:\Program Files (x86)\Microsoft SDKs\Azure\ADLCopy\adlcopy"
 /Source https://finance.blob.core.windows.net/datalakeload/p-abc/
 /Dest adl://abc.azuredatalakestore.net/staging/finance/p-abc-v1.2/   #1
 /sourcekey ==StorageKey== /Pattern "tv*.csv"                         #2
/Account dedeveastus2 /Units 2                                          #3
#1 Implement change in hierarchy by targeting new folder.
#2 Use file patterns for finer control.
#3 Add ADLA account after file name pattern, use 2 parallel workers.

When using the ADLA account, the Pattern switch must be placed before the Account and Units switches. Unattended copy executions are possible. On first execution, ADLCopy prompts for Azure credentials. These are saved in a token cache file in the ADLCopy folder under %AppData%. This file will then provide authentication for scheduled executions of ADLCopy. You'll need to add the Storage account and ADLS account as a data source in ADLA, if not already attached. You can read more about ADLA in chapter 7.
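
Because the switch order matters in ADLA mode, it can help to assemble the argument list in code rather than by hand. The following is a minimal sketch, assuming a hypothetical helper named Get-AdlCopyArgument. It uses only the switches shown in listings 4.12 and 4.13, and it builds the argument list without running ADLCopy.

```powershell
# Hypothetical helper: builds the ADLCopy argument list, always placing
# /Pattern ahead of /Account and /Units.
function Get-AdlCopyArgument {
    param(
        [Parameter(Mandatory)] [string] $Source,
        [Parameter(Mandatory)] [string] $Dest,
        [Parameter(Mandatory)] [string] $SourceKey,
        [string] $Pattern,    # optional file pattern
        [string] $Account,    # optional ADLA account (ADLA mode)
        [int]    $Units       # optional analytics units (ADLA mode)
    )
    $arguments = @('/Source', $Source, '/Dest', $Dest, '/SourceKey', $SourceKey)
    if ($Pattern) { $arguments += '/Pattern', $Pattern }
    if ($Account) {
        $arguments += '/Account', $Account
        if ($Units -gt 0) { $arguments += '/Units', "$Units" }
    }
    $arguments
}

# Values copied from listing 4.13; ==StorageKey== is a placeholder.
$copyArgs = Get-AdlCopyArgument `
    -Source 'https://finance.blob.core.windows.net/datalakeload/p-abc/' `
    -Dest 'adl://abc.azuredatalakestore.net/staging/finance/p-abc-v1.2/' `
    -SourceKey '==StorageKey==' -Pattern 'tv*.csv' `
    -Account 'dedeveastus2' -Units 2
$copyArgs -join ' '
```

The resulting array could be passed to the ADLCopy executable with the call operator and splatting, for example `& $adlCopyPath @copyArgs`, where `$adlCopyPath` is an assumed variable holding the install path from the listings. Omitting -Account yields a standalone-mode argument list.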

4.4.3 Azure Storage Explorer tool

Azure Storage Explorer provides a desktop GUI for uploading files to multiple Azure services, including Storage accounts. Azure Storage Explorer can also connect to ADLS stores using AAD. MIME types are identified by file extension. Figure 4.12 shows Storage Explorer connecting with multiple types of authentication. You can use the drag-and-drop function to upload and download files, or use the individual function buttons in Storage Explorer. Download Azure Storage Explorer at http://mng.bz/nzNK.

Note

Data Explorer and other Azure services don't create sub-folders automatically, but copy files only. Other tools, like Storage Explorer or ADLCopy, will copy files and folder structure. Creating folders yourself, and carefully planning the structure of your folder hierarchy, will help keep your data lake from turning into a data swamp!

With an Azure Data Lake store at your disposal, you are ready to capture data and begin data analysis. Using the zones framework will help you keep your data lake under control. The following chapters will show you how to set up more services in Azure, run real-time and batch processing, and automate your system.

Figure 4.12 Storage Explorer configured to connect to Storage accounts with access keys and SAS keys

4.5 Exercises

The following exercises can help you internalize the new features introduced in this chapter. You should be able to create a Data Lake store and configure access.

4.5.1 Exercise 1

Which of these commands will create a new ADLS store without prompting for additional info?

  1. New-AzDataLakeStoreAccount -ResourceGroupName "ade-dev-eastus2"
  2. New-AzDataLakeStoreAccount -ResourceGroupName "ade-dev-eastus2" -Name "adedeveastus2"
  3. New-AzDataLakeStoreAccount -ResourceGroupName "ade-dev-eastus2" -Name "adedeveastus2" -Location "East US 2"
  4. New-AzDataLakeStoreAccount -ResourceGroupName "ade-dev-eastus2" -Name "adedeveastus2" -Location "East US 2" -Encryption "ServiceManaged"

Solution

When using the New-AzDataLakeStoreAccount command, resource group, account name, and region are required. If the selected ADLS store name doesn't exist, options 3 and 4 will not prompt for additional input. Azure PowerShell will prompt for user input for text values when the parameters are absent.

4.5.2 Exercise 2

The Operations team has installed a data collection application called Vacuum connected to the shop floor machines. They will schedule a daily export for three data types: machine start and stop times, machine amperage draw, and operator inputs. Each file name will include the type, year, month, and day. Devise a folder structure to store each data set.

Solution

  1. Because this is a new data feed, start the folder structure in the /Staging folder.
  2. This data set has a clear department owner. Use a new or existing folder for Operations at /Staging/Operations.
  3. This data set is generated by an application named Vacuum, so create the third-level folder for this application at /Staging/Operations/Vacuum.
  4. Three data sets are listed, each potentially having their own schema and uses. Create a folder for each data set. It is up to you if you add a discriminator for version to the folder name, add a folder beneath it, or disregard the version until a new version of the schema is released. You can create a folder for each data set, like /Staging/Operations/Vacuum/operating_times.
  5. With a single file per day, you have options for the depth of your folder structure. Segregating files by year and month aligns with typical monthly reporting schedules. The exporting process typically handles creating new folders as needed, but you can start the structure to give guidance and ensure correct ACLs are in place. The lowest level of folders should look like /Staging/Operations/Vacuum/operating_times/2019/03.

Summary

  • ADLS is a petabyte-scale storage service which provides a hierarchical folder structure over HDFS. This structure provides fine-grained access control.
  • AAD is used to secure files and folders in Azure Data Lake stores, which reduces management overhead.
  • Dividing the ADLS store into zones creates a structure necessary to control usage. This helps support user access to data.
  • Planning for data drift during creation of ADLS folders provides clear guidance for later accommodating the changes. This helps users work with data in multiple schemas.

1. Michael Gegick and Sean Barnum. “Least Privilege.” Cybersecurity and Infrastructure Security Agency CISA, September 14, 2005. http://mng.bz/mBny.

2. Girish Pancha. “Big Data’s Hidden Scourge: Data Drift.” CMSWire.com. April 8, 2016. http://mng.bz/oPX2.
