Chapter 12. Looking ahead


This chapter covers

  • The ethics of generative models
  • Three recent improvements that we expect to be dominant in the years to come:
    • Relativistic GAN (RGAN)
    • Self-Attention GAN (SAGAN)
    • BigGAN
  • Further reading for three more cutting-edge techniques
  • A summary of the key themes and takeaways from this book

In this final chapter, we want to give you a brief overview of our thoughts about the ethics of GANs. Then we will talk about some key innovations that we expect to become even more important in the future. This chapter covers high-level ideas that we expect to define the future of GANs, rather than a full code project. We want you to be prepared for the GANtastic journey ahead—even for advances that are yet to be published at the time of writing. Lastly, we will wrap up and say our teary-eyed goodbyes.

12.1. Ethics

The world is beginning to realize that AI ethics—GANs included—is an important issue. Some institutions have decided to not release their expensive, pretrained models for fear of misuse as a tool for generating fake news.[1] Numerous articles describe the ways in which GANs specifically may have potential malicious uses.[2]

1 See “An AI That Writes Convincing Prose Risks Mass-Producing Fake News,” by Will Knight, MIT Technology Review, 2019, http://mng.bz/RPGj.

2 See “Inside the World of AI that Forges Beautiful Art and Terrifying Deepfakes,” by Karen Hao, MIT Technology Review, 2019, http://mng.bz/2JA8. See also “AI Gets Creative Thanks to GANs Innovations,” by Jakub Langr, Forbes, 2019, http://mng.bz/1w71.

We all understand that misinformation can be a huge problem and that GANs with photorealistic, synthetic images could pose a danger. Imagine synthesizing videos of a world leader saying they are about to launch a military strike on another country. Will the correcting information spread quickly enough to soothe the panic that will follow?

This is not a book about AI ethics, so we touch on this topic only briefly. But we strongly believe that it is important for all of us to think about the ethics of what we are doing and about the risks and unintended consequences that our work could have. Given that AI is such a scalable technology, it is vital to think through whether we are helping to create a world we want to live in.

We urge you to think about your principles and to go through at least one of the more evolved ethical frameworks. We are not going to discuss which one is better than the other—after all, humans have generally not yet agreed on a moral framework on much more mundane things—but please put the book down and read at least one of these if you have not already.

Note

You can read about Google’s AI principles at https://ai.google/principles. The Institute for Ethical AI & ML details its principles at https://ethical.institute/principles.html. See also “IBM’s Rometty Lays Out AI Considerations, Ethical Principles,” by Larry Dignan, 2017, ZDNet, http://mng.bz/ZeZm.

For example, the technology known as DeepFakes—although not originally based on GANs—has been cited by many as a source for concern.[3] DeepFakes—a portmanteau of deep learning and fake imagery—has already proven controversial by generating fake political videos and synthetic involuntary pornographic content. Soon, this technology may be at a point where it would be impossible to tell whether a video or image is authentic. Given GANs’ ability to synthesize new images, they may even dominate this domain.

3 See “The Liar’s Dividend, and Other Challenges of Deep-Fake News,” by Paul Chadwick, The Guardian, 2018, http://mng.bz/6wN5. See also “If You Thought Fake News Was a Problem, Wait for DeepFakes,” by Roula Khalaf, Financial Times, 2018, http://mng.bz/PO8Y.

To say that everyone should think about the consequences of their research and code seems insufficient, but the reality is that there is no silver bullet. We should consider these implications, even if the initial focus was entirely ethical, regardless of whether we are working in research or industry. We also do not want to give you a full lecture nor unsubstantiated media-grabbing forecasts, but this is a problem we care deeply about.

AI ethics is a real problem already, and we have presented three real problems here—AI-generated fake news, synthesized political proclamations, and involuntary pornography. But many more problems exist, such as Amazon using an AI-hiring tool that showed negative bias against women.[4] But the practical landscape is complicated—some suggest that GANs have a tendency to favor images of women in face-generation. Yet another angle is that GANs also have a potential to help AI be more ethical—by synthesizing the underrepresented class in, for example, face-recognition problems in a semi-supervised setup, thereby improving the quality of classification in less-represented communities.

4 See “Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women,” by Jeffrey Dastin, 2018, Reuters, http://mng.bz/Jz8K.

We are writing this book partially to make everyone more aware of the possibilities and possible misuses of GANs. We are excited by the future academic and practical applications of GANs and the ongoing research, but we are also aware that some applications may have negative uses. Because it is impossible to “uninvent” a technology, we have to be aware of its capabilities. By no means are we saying that the world would be better off if GANs did not exist—but GANs are just a tool, and as we all know, tools can be misused.

We feel morally compelled to talk about the promises and dangers of this technology, as otherwise misusing it becomes easier by a narrow group of the initiated. Although this book is not written for the general public, we hope that it is one stepping stone toward broader awareness—beyond the mostly academic circles that have dominated the field of GANs for now. Equally, much of the public outreach we are doing is—we hope—contributing to greater knowledge and discussion about this topic.

As more people become aware of this technology, even the existing malicious actors will no longer be able to catch anyone by surprise. We are hoping that GANs will never be a source of malicious acts, but that may be too idealistic. The next best thing is for knowledge of GANs to be available to everyone—not just academics and really invested malicious parties. We also hope (and all evidence thus far seems to point to this reality) that GANs will overall contribute positively to art, science, and engineering. Furthermore, people are also working on DeepFake detection, incorporating ideas from GANs and adversarial examples, but we have to be cautious, because any classifier that can detect these with any degree of accuracy will only lend all the more credibility to an example that manages to fool it.

In many ways, we are also hoping to start a more thorough conversation without any grandstanding—this is an invitation to connect with us through the book forums or our Twitter accounts. We are aware that we need a diverse range of perspectives to keep checking our moral framework. We are also aware that these things will evolve over time, especially as use cases become clearer. Indeed, some people—such as Benedict Evans of a16z—argue that to regulate or talk about the ethics of AI does not make any more sense than to talk about the ethics of databases. What matters is the use case, not the technology.

12.2. GAN innovations

Speaking of use cases, we are aware that GANs are an ever-evolving field. In this section, we want to quickly update you on things that are not yet as robust in the community as some of the topics in prior chapters, but things we expect to be significant in the future. In the spirit of keeping this practical, we have picked three GAN innovations that all have an interesting practical application: either a practical paper (RGAN), a GitHub project (SAGAN), or an artistic application (BigGAN).

12.2.1. Relativistic GAN

Not often do we get to see an update so simple and elegant that it could have been in the original paper, yet powerful enough to beat many of the state-of-the-art algorithms. The Relativistic GAN (RGAN) is one such example. The core idea of the RGAN is that in addition to the original GAN (specifically, the NS-GAN that you may recall from chapter 5), we add an extra term to the Generator—forcing it to make the generated data seem more real than the real data.

In other words, the Generator should, in addition to making fake data seem more real, make real data seem comparatively less real, thereby also increasing the stability of the training. But of course, the only data the Generator has control over is the synthetic data, so the Generator can achieve this only comparatively.

The RGAN’s author describes it as being a generalized version of the WGAN, which we discussed previously. Let’s start with the simplified loss functions from table 5.1 in chapter 5:

equation 12.1. L(D) = -E[log(D(x))] - E[log(1 - D(G(z)))]
equation 12.2. L(G) = -E[log(D(G(z)))]

Recall that equation 12.1 describes the loss function for the Discriminator—where we measure the difference between the real data (D(x)) and the generated one (D(G(z))). Equation 12.2 then describes the loss function of the Generator, where we are trying to make the Discriminator believe that the samples it is seeing are real.

To go to our closest predecessor, remember that the WGAN is trying to minimize the amount of probability mass we would have to move to get the generated distribution to look like the real one. In this sense, the RGAN has many similarities (for example, the Discriminator is frequently called the critic, and the WGAN is presented as a special case of the RGAN in this paper). Ultimately, both measure the current state of play as a single number—remember the earth mover’s distance?

The innovation of the RGAN is that we no longer get the previously unhelpful dynamic of the Generator always playing catch-up. In other words, the Generator is trying to generate data that looks more realistic than the real data so that it is not always on the defensive. As a result, D(x) can be interpreted as the probability that the real data is more realistic than the generated data.

Before we delve into the difference on a high level, we will introduce a slightly different notation, so as to approximate the notation used by the paper, but simplified. In equations 12.3 and 12.4, C(x) acts as a critic similar to a WGAN setup,[5] and you may think of it as a Discriminator. Furthermore, a() is defined as log(sigmoid()). In the paper, G(z) is replaced by xf for fake samples, and x gets subscript r to indicate real samples, but we will follow the simpler notation from the earlier chapters.

5 Because we are skipping over some details, we want to equip you with the high-level idea and keep the notation consistent so that you can fill in the blanks yourself.

equation 12.3. L(D) = -E[a(C(x) - C(G(z)))]
equation 12.4. L(G) = -E[a(C(G(z)) - C(x))]

Importantly, in these equations, we see only one key difference in the Generator: the real data now gets into the loss function. This seemingly simple trick aligns the incentives of the Generator so that it is not at a permanent disadvantage. To understand this and two other perspectives in an idealized setting, let’s plot the different Discriminator outputs as in figure 12.1.

Figure 12.1. Under divergence minimization (a), the Generator is always playing catch-up with the Discriminator (because divergence is always ≥ 0). In (b), we see what “good” NS-GAN training looks like. Again, the Generator cannot win. In (c), we can see that the Generator can now win; more importantly, the Generator always has something to strive for (and therefore recovers useful gradients), no matter the stage of training.

(Source: “The Relativistic Discriminator: A Key Element Missing from Standard GAN,” by Alexia Jolicoeur-Martineau, 2018, http://arxiv.org/abs/1807.00734.)
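To make the relativistic losses concrete, here is a minimal NumPy sketch of equations 12.3 and 12.4. The function names and the toy batch handling are ours, not the paper's; a real implementation would produce C(x) and C(G(z)) with a neural network rather than take them as inputs:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def a(x):
    # a() = log(sigmoid()), as defined for equations 12.3 and 12.4
    return np.log(sigmoid(x))

def rsgan_losses(c_real, c_fake):
    """Relativistic standard GAN losses.

    c_real, c_fake: raw critic outputs C(x) and C(G(z)) for a batch.
    Returns (discriminator_loss, generator_loss).
    """
    d_loss = -np.mean(a(c_real - c_fake))  # real should score above fake
    g_loss = -np.mean(a(c_fake - c_real))  # fake should score above real
    return d_loss, g_loss
```

Note that the two losses are mirror images of each other: the Generator is rewarded exactly when its samples score as more realistic than the real ones, so neither player is at a permanent structural disadvantage.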

You may be wondering: why should just adding this term be noteworthy? Well, this simple addition makes the training significantly more stable at a little extra computational cost. This is important, especially when you remember the “Are GANs Created Equal?” paper from chapter 5, where the authors argue that all the major GAN architectures considered so far have seen only limited improvement over the original GAN when adjusted for the extra processing requirements. This is because many new GAN architectures are better only at huge computational cost, which makes them less useful, but the RGAN has the potential to change GAN architectures across the board.

Always be aware of this trick: even though a method may take fewer update steps, if each step takes two times longer because of the extra computation, is it really worth it? The peer-review process at most conferences is not immune to this weakness, so you have to be careful.

Application

Your next question may be: why should this matter in practice? In less than a year, this paper has gathered more than 50 citations[6]—which is a lot for a new paper from a previously unknown author. Moreover, people have already written papers using the RGAN to, for example, achieve state-of-the-art speech enhancement (that is, the best performance ever achieved), beating other GAN-based and non-GAN-based methods.[7]

6 The following link lists all the papers that cite the RGAN paper: http://mng.bz/omGj.

7 See “SERGAN: Speech Enhancement Using Relativistic Generative Adversarial Networks with Gradient Penalty,” by Deepak Baby and Sarah Verhulst, 2019, IEEE ICASSP, https://ieeexplore.ieee.org/document/8683799.

As you are reading this, the paper should be available, so feel free to take a look. Explaining this paper, with all the necessary background, however, is beyond the scope of this book.

12.2.2. Self-Attention GAN

The next innovation we believe is going to change the landscape is the Self-Attention GAN (SAGAN). Attention is based on a very human idea of how we look at the world—through small patches of focus at a time.[8] A GAN’s attention works similarly: your mind is consciously able to focus on only a small part of, say, a table, but your brain is able to stitch the whole table together through quick, minor eye movements called saccades, while still focusing on only a subset of the image at a time.

8 See The Mind Is Flat: The Illusion of Mental Depth and the Improvised Mind by Nick Chater (Penguin, 2018).

The computer equivalent has been used in many fields, including natural language processing (NLP) and computer vision. Attention can help us solve, for example, the problem of convolutional neural networks (CNNs) ignoring much of the picture. As we know, CNNs rely on a small receptive field, as determined by the size of the convolution. However, as you may recall from chapter 5, in GANs, the size of the receptive field is likely to cause problems (such as cows with multiple heads or bodies), and the GAN will not consider them strange.

This is because when generating or evaluating that subset of the image, we may see that a leg is present in one field, but we do not see that other legs are already present in another one. This could be because the convolution ignores the structure of the object or because legs or leg rotations are represented by different, higher-level neurons that do not talk to each other. Our seasoned data scientists will remember that this is what Hinton’s CapsuleNets were attempting to solve, but they never really took off. For everyone else, the short story is that no one can say with absolute certainty why attention fixes this, but a good way to think about it is that we can now create feature detectors with a flexible receptive field (shape) to really focus on several key aspects of a given picture (see figure 12.2).

Figure 12.2. The output pixel (2 × 2 patch) ignores anything except the small highlighted region. Attention helps us solve that.

(Source: “Convolution Arithmetic,” by vdmoulin, 2016, https://github.com/vdumoulin/conv_arithmetic.)

Recall that this is especially a problem when our images are, say, 512 × 512, but the largest commonly used convolution sizes are 7, so that is loads of ignored features! Even in higher-level nodes, the neural network may not be appropriately checking for, for example, a head in the right place. As a result, as long as the cow has a cow head next to a cow body, the network does not care about any other head, as long as it has at least one. But the structure is wrong.

These higher-level representations are harder to reason about, and so even researchers disagree as to exactly why this happens; but empirically, the network does not seem to pick it up. Attention allows us to pick out the relevant regions—whatever the shape or size—and consider them appropriately. To see the types of regions that attention can flexibly focus on, consider figure 12.3.

Figure 12.3. Here, we can see the regions of the image that the attention mechanism pays most attention to, given a representative query location. We can see that the attention mechanism generally cares about regions of different shapes and sizes, which is a good sign, given that we want it to pick out the regions of the image that indicate the kind of object it is.

(Source: “Self-Attention Generative Adversarial Networks,” by Han Zhang, 2018, http://arxiv.org/abs/1805.08318.)
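To make the idea of a flexible receptive field concrete, here is a toy NumPy sketch of a single self-attention pass over the flattened locations of a feature map. This is not the SAGAN implementation (which adds 1 × 1 convolutions, a learned scale, and a residual connection); the shapes, names, and random weights are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention(x, w_q, w_k, w_v):
    """x: (n, c) feature vectors for the n spatial locations of a feature map.
    Every output location is a weighted mix of *all* locations, so the
    effective receptive field is no longer limited by the convolution size."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T                                  # (n, n) pairwise affinities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # softmax over locations
    return weights @ v, weights

n, c, d = 16, 8, 4  # a 4 x 4 feature map flattened to 16 locations
x = rng.normal(size=(n, c))
w_q, w_k, w_v = (rng.normal(size=(c, d)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
```

Each row of `attn` is the learned "shape" of attention for one location: it can concentrate on a few distant spots or spread over a large region, which is exactly the flexibility a fixed convolution lacks.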

Application

DeOldify (https://github.com/jantic/DeOldify) is one of the popular applications of the SAGAN and was made by Jason Antic, a student of Jeremy Howard’s fast.ai course. DeOldify uses the SAGAN to colorize old images and drawings to an amazing level of accuracy. As you can see in figure 12.4, you can turn famous historic photographs and paintings into fully colorized versions.

Figure 12.4. Deadwood, South Dakota, 1877. The image on the right has been colorized . . . for a black-and-white book. Trust us. If you do not believe us, check out the online liveBook on Manning’s website to see for yourself!

12.2.3. BigGAN

Another architecture that has taken the world by storm is BigGAN.[9] BigGAN has achieved highly realistic 512 × 512 images on all 1,000 classes of ImageNet—a feat previously deemed almost impossible with the current generation of GANs. BigGAN achieved three times the previous best inception score. In brief, BigGAN builds on the SAGAN and spectral normalization and has further innovated in five directions:

9 See “Large Scale GAN Training for High Fidelity Natural Image Synthesis,” by Andrew Brock et al., 2019, https://arxiv.org/pdf/1809.11096.pdf.

  • Scaling up GANs to a previously unbelievable computational scale. The BigGAN authors trained with eight times the batch size, which was part of their success—already giving a 46% boost. Theoretically, the resources required to train a BigGAN add up to $59,000 worth of compute.[10]

    10 See Mario Klingemann’s Twitter post at http://mng.bz/wll2.

  • BigGAN’s architecture has 1.5 times the number of channels (feature maps) in each layer relative to the SAGAN architecture. This may be due to the complexity of the dataset used.
  • Improving the stability of the Generator and the Discriminator by controlling the adversarial process, which leads to overall better results. The underlying mathematics are unfortunately beyond the scope of this book, but if you’re interested, we recommend starting by understanding spectral normalization. For those who are not, take solace in the fact that even the authors themselves abandon this strategy in later parts of training and let the mode collapse happen because of the computational costs.
  • Introducing a truncation trick to give us a way of controlling the trade-off between variety and fidelity. The truncation trick achieves better-quality results if we sample closer to the middle of the distribution (truncate it). It makes sense that this would yield better samples, as this is where BigGAN has the “most experience.”
  • The authors introduce a further three theoretical advancements. According to the authors’ own performance table, however, these seem to have only a marginal effect on the scores and frequently lead to less stability. They are useful for computational efficiency, but we will not discuss them.
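As a rough illustration of the truncation trick, the following NumPy sketch resamples every latent component that falls outside a threshold, pulling samples toward the middle of the distribution. The particular threshold, seed, and resampling loop are our choices for illustration; BigGAN applies this idea to the latent inputs of a trained Generator:

```python
import numpy as np

rng = np.random.default_rng(42)

def truncated_z(batch, dim, threshold=0.5):
    """Truncation trick: resample any latent component whose magnitude
    exceeds `threshold`, keeping samples near the middle of the prior,
    where the model has the "most experience".
    Lower thresholds trade variety for fidelity."""
    z = rng.standard_normal((batch, dim))
    mask = np.abs(z) > threshold
    while mask.any():
        z[mask] = rng.standard_normal(mask.sum())
        mask = np.abs(z) > threshold
    return z

z = truncated_z(8, 128, threshold=0.5)
```

Sweeping the threshold from high to low is how you would move along the variety-versus-fidelity trade-off described above.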
Application

One fascinating artistic application of BigGAN is the Ganbreeder app, which was made possible thanks to the pretrained models and Joel Simon’s hard work. Ganbreeder is an interactive, web-based (free!) way to explore the latent space of BigGAN. It has been used in numerous artistic applications as a way to come up with new images.

You can either explore the adjacent latent space or use a linear interpolation between the latent samples of two images to create new images. Figure 12.5 shows an example of creating Ganbreeder offspring.

Figure 12.5. Every time you click the Make Children button, Ganbreeder gives you a selection of mutated images in the nearby latent space, producing the three images below. You may start from your own sample or someone else’s—thereby making it a collaborative exercise. This is what the Crossbreed section is for, where you can select another interesting sample from other parts of the space and mix the two samples. Lastly, in Edit-Genes, you can edit parameters (such as Castle and Stone Wall, in this case) and add more or less of that feature into the picture.

(Source: Ganbreeder, http://mng.bz/nv28.)
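The linear interpolation behind this kind of morphing is simple to sketch. In practice, each interpolated row would be fed to the Generator to render one frame of the transition; the vectors and step count below are toy values of ours:

```python
import numpy as np

def interpolate(z1, z2, steps=5):
    """Linear interpolation between two latent vectors; feeding each
    row to a Generator yields a gradual morph from image one to image two."""
    alphas = np.linspace(0.0, 1.0, steps)[:, None]
    return (1 - alphas) * z1 + alphas * z2

z1, z2 = np.zeros(4), np.ones(4)
path = interpolate(z1, z2, steps=5)
```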

BigGAN is further notable because DeepMind has given us all this compute for free and uploaded pretrained models onto TensorFlow Hub—the machine learning code repository that we used in chapter 6.

12.3. Further reading

We wanted to cover many other topics that seem to be gaining popularity in the works of academics and practitioners, but we did not have the space. Here, we will list at least three of them for interested readers. We hope we have equipped you with all that you need to understand these papers. We picked just three, as we expect this section to keep changing quickly:

  • StyleGAN (http://arxiv.org/abs/1812.04948) merges ideas from GANs and “traditional” style transfer to give users much more control over the output they generate. This Conditional GAN from NVIDIA has managed to produce stunning full-HD results with several levels of control—from finer details to the overall image. This work builds on chapter 6, so you may want to reread it before delving into this paper.
  • Spectral normalization (http://arxiv.org/abs/1802.05957) is a complex regularization technique and requires somewhat advanced linear algebra. For now, just remember the use case—stabilizing training by normalizing the weights in a network to satisfy a particular property, which is even formally required in the WGAN (touched on in chapter 5). Spectral normalization acts somewhat similarly to gradient penalties.
  • SPADE, aka GauGAN (https://arxiv.org/pdf/1903.07291.pdf), is cutting-edge work published in 2019 that synthesizes photorealistic images based solely on a semantic map of the image, as you may recall from the start of chapter 9. The images can be up to 512 × 256 in resolution, but knowing NVIDIA, this may increase before the end of the year. This may be the most challenging technique of the three, but also the one that has gathered the most media attention—probably because of how impressive the demo is!
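The use case behind spectral normalization can be sketched without the advanced linear algebra: divide a weight matrix by an estimate of its largest singular value (obtained cheaply by power iteration), so that the normalized layer cannot stretch any input direction by more than a factor of 1. The iteration count and seed below are arbitrary choices of ours:

```python
import numpy as np

def spectral_normalize(w, iters=50):
    """Estimate the largest singular value of w by power iteration and
    divide it out, making the layer (roughly) 1-Lipschitz."""
    u = np.random.default_rng(0).normal(size=w.shape[0])
    for _ in range(iters):
        v = w.T @ u
        v /= np.linalg.norm(v)
        u = w @ v
        u /= np.linalg.norm(u)
    sigma = u @ w @ v  # estimate of the top singular value
    return w / sigma

w = np.diag([3.0, 1.0, 0.5])  # top singular value is 3.0 by construction
w_sn = spectral_normalize(w)
```

Real implementations keep `u` between training steps so that a single iteration per step suffices, which is why the technique is so cheap in practice.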

There is so much going on in the world of GANs that it may be impossible to stay up-to-date all the time. However, we hope that in terms of both ethical frameworks and the latest interesting papers, we have given you the resources needed to look at the problems in this ever-evolving space. Indeed, that is our hope, even when it comes to the innovations behind the GANs presented in this chapter. We do not know whether all of these will become part of the routine bag of tricks that people use, but we think that they might. We also hope that this will be true for the most recent innovations listed in this section.

12.4. Looking back and closing thoughts

We hope that the cutting-edge techniques we’ve discussed will give you enough subject material to continue exploring GANs even as our book comes to an end. Before we send you off, however, it is worth looking back and recapping all that you have learned.

We started off with a basic explanation of what GANs are and how they work (chapter 1) and implemented a simple version of this system (chapter 3). We introduced you to generative models in an easier setting with autoencoders (chapter 2). We covered the theory of GANs (chapters 3 and 5) as well as their shortcomings and some of the ways to overcome them (chapter 5). This provided the foundation and tools for the later, advanced chapters.

We implemented several of the most canonical and influential GAN variants—the Deep Convolutional GAN (chapter 4) and the Conditional GAN (chapter 8)—as well as a few of the most advanced and complex ones—Progressive GANs (chapter 6) and CycleGANs (chapter 9). We also implemented the Semi-Supervised GAN (chapter 7), a GAN variant designed to tackle one of the most severe shortcomings in machine learning: the lack of large, labeled datasets. We also explored several of the many practical and innovative applications of GANs (chapter 11) and presented adversarial examples (chapter 10), which are a challenge for all of machine learning.

Along the way, you expanded your theoretical and practical toolbox. From inception score and Fréchet inception distance (chapter 5) to pixel-wise feature normalization (chapter 6), batch normalization (chapter 4), and dropout (chapter 7), you learned about concepts and techniques that will serve you well for GANs and beyond.

As we look back, it is worth highlighting a few themes that came up time and time again as we explored GANs:

  • GANs are tremendously versatile, in terms of both practical use cases and resilience against theoretical requirements and constraints. This was perhaps most apparent in the case of the CycleGAN in chapter 9. This technique not only is unconstrained by the need for paired data that burdened its predecessors, but also can translate between examples in virtually any domain, from apples and oranges to horses and zebras. The versatility of GANs was also evident in chapter 6, where you saw that Progressive GANs can learn to generate equally well images as disparate as human faces and medical mammograms, and in chapter 7, where we needed to make only a handful of adjustments to turn the Discriminator into a multiclass classifier.
  • GANs are as much an art as they are a science. The beauty and the curse of GANs—and, indeed, deep learning in general—is that our understanding of what makes them work so well in practice is limited. Few known mathematical guarantees exist, and most achievements are experimental only. This makes GANs susceptible to many training pitfalls, such as mode collapse, which you may recall from our discussions in chapter 5. Fortunately, researchers have found many hacks and tricks that greatly mitigate these challenges—everything from input preprocessing to our choice of optimizer and activation functions—many of which you learned about and even saw firsthand in code tutorials throughout the book. Indeed, as the GAN variants covered in this chapter show, the techniques to improve GANs continue to evolve.

In addition to difficulties in training, it is crucial to keep in mind that even techniques as powerful and versatile as GANs have other important limitations. GANs have been hailed by many as the technique that gives machines the gift of creativity. This is true to a degree—in a few short years, GANs have become the undisputed state-of-the-art technique in synthesizing fake data; however, they fall short of what human creativity can do.

Indeed, as we showed time and time again throughout this book, GANs can mimic the features of almost any existing dataset and come up with examples that look as though they came from that dataset. However, by their very nature, GANs will not stray far from the training data. For instance, if we have a training dataset of classical art masterpieces, the examples our GAN produces will look more like Michelangelo than Jackson Pollock. Until a new AI paradigm comes along that gives machines true autonomy, it will ultimately be up to the (human) researcher to guide the GAN to the desired end goal.

As you experiment with GANs and their applications, keep in mind not only the practical techniques, tips, and tricks covered throughout this book, but also the ethical considerations discussed in this chapter. With that, we wish you all the best in the GANtastic journey ahead.

—Jakub and Vladimir

Summary

  • We touched on AI and GAN ethics and discussed the moral frameworks, need for awareness, and openness of discussion.
  • We equipped you with the innovations we believe will drive the future of GANs, and we gave you the high-level idea behind the following:
    • Relativistic GAN, which now ensures that the Generator considers the relative likelihood of real and generated data
    • SAGAN, with attention mechanisms that act similarly to human perception
    • BigGAN, which allowed us to generate all 1,000 ImageNet classes of unprecedented quality
  • We highlighted two key recurring themes of our book: (1) the versatility of GANs and (2) the necessity for experimentation because, much like the rest of deep learning, GANs are as much an art as they are a science.