Chapter 3. Your first GAN: Generating handwritten digits

This chapter covers

  • Exploring the theory behind GANs and adversarial training
  • Understanding how GANs differ from conventional neural networks
  • Implementing a GAN in Keras, and training it to generate handwritten digits

In this chapter, we explore the foundational theory behind GANs. We introduce the commonly used mathematical notation you may encounter if you choose to dive deeper into this field, perhaps by reading a more theoretically focused publication or even one of the many academic papers on this topic. This chapter also provides background knowledge for the more advanced chapters, particularly chapter 5.

From a strictly practical standpoint, however, you don’t have to worry about many of these formalisms—much as you don’t need to know how an internal combustion engine works to drive a car. Machine learning libraries such as Keras and TensorFlow abstract the underlying mathematics away from us and neatly package them into importable lines of code.

This will be a recurring theme throughout this book; it is also true for machine learning and deep learning in general. So, if you are someone who prefers to dive straight into practice, feel free to skim through the theory section and skip ahead to the coding tutorial.


3.1. Foundations of GANs: Adversarial training

Formally, the Generator and the Discriminator are represented by differentiable functions, such as neural networks, each with its own cost function. The two networks are trained by backpropagation by using the Discriminator’s loss. The Discriminator strives to minimize the loss for both the real and the fake examples, while the Generator tries to maximize the Discriminator’s loss for the fake examples it produces.

This dynamic is summarized in figure 3.1. It is a more general version of the diagram from chapter 1, where we first explained what GANs are and how they work. Instead of the concrete example of handwritten digits, in this diagram, we have a general training dataset which, in theory, could be anything.

Figure 3.1. In this GAN architecture diagram, both the Generator and the Discriminator are trained using the Discriminator’s loss. The Discriminator strives to minimize the loss; the Generator seeks to maximize the loss for the fake examples it produces.

Importantly, the training dataset determines the kind of examples the Generator will learn to emulate. If, for instance, our goal is to produce realistic-looking images of cats, we would supply our GAN with a dataset of cat images.

In more technical terms, the Generator’s goal is to produce examples that capture the data distribution of the training dataset.[1] Recall that to a computer, an image is just a matrix of values: two-dimensional for grayscale and three-dimensional for color (RGB) images. When rendered onscreen, the pixel values within these matrices manifest all the visual elements of an image—lines, edges, contours, and so forth. These values follow a complex distribution across each image in a dataset; after all, if no distribution is followed, an image will be no more than random noise. Object recognition models learn the patterns in images to discern an image’s content. The Generator can be thought of as the reverse of this process: rather than recognizing these patterns, it learns to synthesize them.

1 See “Generative Adversarial Networks,” by Ian J. Goodfellow et al., 2014, https://arxiv.org/abs/1406.2661.

3.1.1. Cost functions

Following the standard notation, let J(G) denote the Generator’s cost function and J(D) the Discriminator’s cost function. The trainable parameters (weights and biases) of the two networks are represented by the Greek letter theta: θ(G) for the Generator and θ(D) for the Discriminator.

GANs differ from conventional neural networks in two key respects. First, the cost function, J, of a traditional neural network is defined exclusively in terms of its own trainable parameters, θ. Mathematically, this is expressed as J(θ). In contrast, GANs consist of two networks whose cost functions are dependent on both of the networks’ parameters. That is, the Generator’s cost function is J(G)(θ(G), θ(D)), and the Discriminator’s cost function is J(D)(θ(G), θ(D)).[2]

2 See “NIPS 2016 Tutorial: Generative Adversarial Networks,” by Ian Goodfellow, 2016, https://arxiv.org/abs/1701.00160.
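For reference, both of the papers cited in this section express these cost functions through a single value function. In the original zero-sum formulation (Goodfellow et al., 2014), the two networks play a minimax game over

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

where the Discriminator’s cost is J(D) = –V(D, G) and, in the zero-sum setting, the Generator’s cost is J(G) = –J(D).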

The second (related) difference is that a traditional neural network can tune all its parameters, θ, during the training process. In a GAN, each network can tune only its own weights and biases. The Generator can tune only θ(G), and the Discriminator can tune only θ(D) during training. Accordingly, each network has control over only a part of what determines its loss.

To make this a little less abstract, consider the following analogy. Imagine we are choosing which route to drive home from work. If there is no traffic, the fastest option is the highway. During rush hour, however, we may be better off taking one of the side roads. Despite being longer and windier, they might get us home faster when the highway is all clogged up with traffic.

Pxr’c phsera jr cs c crmy epormbl. Zor J xu tqv xrza cifunotn, ndieedf cc dor otamnu kl mjro jr ketas dc re rpk dmxk. Qdt fezy zj rx nemiimiz J. Vet iilypcstmi, fxr’a masuse kw kzbv c kzr jxrm xr vleea rqo cifeof, ak wo nnaotc eveal ryael rk hkr heada xl qtpz dvtb kt crcq rcxf er ivoda rj. Xuv enfq rpatmerae, θ, ow sna nchega cj xtq teruo.

If ours were the only car on the road, our cost would be similar to a regular neural network’s: it would depend only on the route, and it would be entirely within our power to optimize, J(θ). However, as soon as we introduce other drivers into the equation, the situation gets more complicated. Suddenly, the time it will take us to get home depends not only on our decisions but also on other drivers’ courses of action, J(θ(us), θ(other drivers)). Much like the Generator and Discriminator networks, our “cost function” will depend on an interplay of factors, some of which are under our control and others of which are not.

3.1.2. Training process

The two differences we’ve described have far-reaching implications on the GAN training process. The training of a traditional neural network is an optimization problem. We seek to minimize the cost function by finding a set of parameters such that moving to any neighboring point in the parameter space would increase the cost. This could be either a local or a global minimum in the parameter space, as determined by the cost function we are seeking to minimize. Figure 3.2 illustrates the optimization process of minimizing a cost function.

Figure 3.2. The bowl-shaped mesh represents the loss J in the parameter space θ1 and θ2. The black dotted line illustrates the minimization of the loss in the parameter space through optimization.

(Source: “Adversarial Machine Learning,” by Ian Goodfellow, ICLR Keynote, 2019, www.iangoodfellow.com/slides/2019-05-07.pdf.)

Because the Generator and Discriminator can tune only their own parameters and not each other’s, GAN training can be better described as a game, rather than optimization.[3] The players in this game are the two networks that the GAN comprises.

3 Ibid.

Recall from chapter 1 that GAN training ends when the two networks reach Nash equilibrium, a point in a game at which neither player can improve their situation by changing their strategy. Mathematically, this occurs when the Generator cost J(G)(θ(G), θ(D)) is minimized with respect to the Generator’s trainable parameters θ(G) and, simultaneously, the Discriminator cost J(D)(θ(G), θ(D)) is minimized with respect to the parameters under this network’s control, θ(D).[4] Figure 3.3 illustrates the setup of a two-player zero-sum game and the process of reaching Nash equilibrium.

4 Ibid.

Figure 3.3. Player 1 (left) seeks to minimize V by tuning θ1. Player 2 (middle) seeks to minimize –V (maximize V) by tuning θ2. The saddle-shaped mesh (right) shows the combined loss in the parameter space V(θ1, θ2). The dotted line shows the convergence to Nash equilibrium at the center of the saddle. (Source: Goodfellow, 2019, www.iangoodfellow.com/slides/2019-05-07.pdf.)

Coming back to our analogy, Nash equilibrium would occur when every route home takes exactly the same amount of time—for us and all other drivers we may encounter on the way. Any faster route would be offset by a proportional increase in traffic, slowing everyone down just the right amount. As you may imagine, this state is virtually unattainable in real life. Even with tools like Google Maps that provide real-time traffic updates, it is often impossible to perfectly evaluate the optimal path home.

The same is true in the high-dimensional, nonconvex world of training GANs. Even small 28 × 28-pixel grayscale images like the ones in the MNIST dataset have 28 × 28 = 784 dimensions. If they were colored (RGB), their dimensionality would increase threefold, to 2,352. Capturing this distribution across all images in the training dataset is extremely difficult, especially when all we have to learn from is an adversary (the Discriminator).
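These dimension counts are easy to verify with NumPy (a throwaway check, not part of the tutorial code):

```python
import numpy as np

grayscale = np.zeros((28, 28))     # one grayscale image: a 2-D matrix
rgb = np.zeros((28, 28, 3))        # the same image in RGB: a 3-D array

print(grayscale.size)  # 784
print(rgb.size)        # 2352
```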

Training GANs successfully requires trial and error, and although there are best practices, it remains as much an art as it is a science. Chapter 5 revisits the topic of GAN convergence in more detail. For now, you can rest assured that the situation is not as bad as it may sound. As we previewed in chapter 1, and as you will see throughout this book, neither the enormous complexities in approximating the generative distribution nor our lack of complete understanding of what conditions make GANs converge has impeded GANs’ practical usability and their ability to generate realistic data samples.


3.2. The Generator and the Discriminator

Let’s recap what you’ve learned by introducing more notation. The Generator (G) takes in a random noise vector z and produces a fake example x*. Mathematically, G(z) = x*. The Discriminator (D) is presented either with a real example x or with a fake example x*; for each input, it outputs a value between 0 and 1 indicating the probability that the input is real. Figure 3.4 depicts the GAN architecture by using the terminology and notation we just presented.

Figure 3.4. The Generator network G transforms the random vector z into a fake example x*: G(z) = x*. The Discriminator network D outputs a classification of whether the input example is real. For the real examples x, the Discriminator strives to output values as close to 1 as possible. For the fake examples x*, the Discriminator strives to output values as close to 0 as possible. In contrast, the Generator wants D(x*) to be as close as possible to 1, indicating that the Discriminator was fooled into classifying a fake example as real.

3.2.1. Conflicting objectives

The Discriminator’s goal is to be as accurate as possible. For the real examples x, D(x) seeks to be as close as possible to 1 (the label for the positive class). For fake examples x*, D(x*) strives to be as close as possible to 0 (the label for the negative class).

The Generator’s goal is the opposite. It seeks to fool the Discriminator by producing fake examples x* that are indistinguishable from the real data in the training dataset. Mathematically, the Generator strives to produce fake examples x* such that D(x*) is as close to 1 as possible.

3.2.2. Confusion matrix

The Discriminator’s classifications can be expressed in terms of a confusion matrix, a tabular representation of all the possible outcomes in binary classification. In the case of the Discriminator, these are as follows:

  • True positive—Real example correctly classified as real; D(x) ≈ 1
  • False negative—Real example incorrectly classified as fake; D(x) ≈ 0
  • True negative—Fake example correctly classified as fake; D(x*) ≈ 0
  • False positive—Fake example incorrectly classified as real; D(x*) ≈ 1

Table 3.1 presents these outcomes.

Table 3.1. Confusion matrix of Discriminator outcomes

  Input       Output close to 1 (real)   Output close to 0 (fake)
  Real (x)    True positive              False negative
  Fake (x*)   False positive             True negative

Using the confusion matrix terminology, the Discriminator is trying to maximize true positive and true negative classifications or, equivalently, minimize false positive and false negative classifications. In contrast, the Generator’s goal is to maximize the Discriminator’s false positive classifications—these are the instances in which the Generator successfully fools the Discriminator into believing a fake example is real. The Generator is not concerned with how well the Discriminator classifies the real examples; it cares only about the Discriminator’s classifications of the fake data samples.
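These four outcomes can be tallied directly from the Discriminator’s outputs. The following is a small illustrative sketch; the helper `confusion_counts` and the 0.5 decision threshold are our own choices, not part of the tutorial code:

```python
import numpy as np

def confusion_counts(d_real, d_fake, threshold=0.5):
    """Tally the four confusion-matrix outcomes from Discriminator outputs.

    d_real: predicted probabilities D(x) for a batch of real examples.
    d_fake: predicted probabilities D(x*) for a batch of fake examples.
    """
    d_real, d_fake = np.asarray(d_real), np.asarray(d_fake)
    return {
        "true_positive":  int(np.sum(d_real >= threshold)),  # real classified as real
        "false_negative": int(np.sum(d_real < threshold)),   # real classified as fake
        "true_negative":  int(np.sum(d_fake < threshold)),   # fake classified as fake
        "false_positive": int(np.sum(d_fake >= threshold)),  # fake classified as real
    }

print(confusion_counts([0.9, 0.8, 0.3], [0.1, 0.2, 0.7]))
# {'true_positive': 2, 'false_negative': 1, 'true_negative': 2, 'false_positive': 1}
```

In these terms, the Generator wants the `false_positive` count to be as high as possible.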


3.3. GAN training algorithm

Let’s revisit the GAN training algorithm from chapter 1 and formalize it by using the notation introduced in this chapter. Unlike the algorithm in chapter 1, this one uses mini-batches rather than one example at a time.

GAN training algorithm

For each training iteration do

  1. Train the Discriminator:
    1. Take a random mini-batch of real examples: x.
    2. Take a mini-batch of random noise vectors z and generate a mini-batch of fake examples: G(z) = x*.
    3. Compute the classification losses for D(x) and D(x*), and backpropagate the total error to update θ(D) to minimize the classification loss.
  2. Train the Generator:
    1. Take a mini-batch of random noise vectors z and generate a mini-batch of fake examples: G(z) = x*.
    2. Compute the classification loss for D(x*), and backpropagate the loss to update θ(G) to maximize the classification loss.

End for

Notice that in step 1, the Generator’s parameters are kept intact while we train the Discriminator. Similarly, in step 2, we keep the Discriminator’s parameters fixed while the Generator is trained. The reason we allow updates only to the weights and biases of the network being trained is to isolate all changes to only the parameters that are under the network’s control. This ensures that each network gets relevant signals about the updates to make, without interference from the other’s updates. You can almost think of it as two players taking turns.

Of course, you can imagine a scenario in which each player merely undoes the other’s progress, so not even a turn-based game is guaranteed to yield a useful outcome. (Have we said yet that GANs are notoriously tricky to train?) More on this in chapter 5, where we also discuss techniques to maximize our chances of success.

That’s it for theory, for the time being. Let’s now put what we learned into practice and implement our first GAN.


3.4. Tutorial: Generating handwritten digits

In this tutorial, we will implement a GAN that learns to produce realistic-looking handwritten digits. We will use the Python neural network library Keras with a TensorFlow backend. Figure 3.5 shows a high-level architecture of the GAN we will implement.

Figure 3.5. Over the course of the training iterations, the Generator learns to turn random noise input into images that look like members of the training data: the MNIST dataset of handwritten digits. Simultaneously, the Discriminator learns to distinguish the fake images produced by the Generator from the genuine ones coming from the training dataset.

Much of the code used in this tutorial—especially the boilerplate code used in the training loop—was adapted from the open source GitHub repository of GAN implementations in Keras, Keras-GAN, created by Erik Linder-Norén (https://github.com/eriklindernoren/Keras-GAN). The repository also includes several advanced GAN variants, some of which will be covered later in this book. We revised and simplified the implementation considerably, in terms of both code and network architecture, and we renamed variables so that they are consistent with the notation used in this book.

A Jupyter notebook with the full implementation, including added visualizations of the training progress, is available on the book’s website at www.manning.com/books/gans-in-action and in the GitHub repository for this book at https://github.com/GANs-in-Action/gans-in-action under the chapter-3 folder. The code was tested with Python 3.6.0, Keras 2.1.6, and TensorFlow 1.8.0.

3.4.1. Importing modules and specifying model input dimensions

First, we import all the packages and libraries needed to run the model. Notice that we also import the MNIST dataset of handwritten digits directly from keras.datasets.

Listing 3.1. Import statements
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np

from keras.datasets import mnist
from keras.layers import Dense, Flatten, Reshape
from keras.layers.advanced_activations import LeakyReLU
from keras.models import Sequential
from keras.optimizers import Adam

Second, we specify the input dimensions of our model and dataset. Each image in MNIST is 28 × 28 pixels with a single channel (because the images are grayscale). The variable z_dim sets the size of the noise vector, z.

Listing 3.2. Model input dimensions
img_rows = 28
img_cols = 28
channels = 1

img_shape = (img_rows, img_cols, channels)    #1

z_dim = 100                                   #2

Next, we implement the Generator and the Discriminator networks.

3.4.2. Implementing the Generator

For simplicity, the Generator is a neural network with only a single hidden layer. It takes in z as input and produces a 28 × 28 × 1 image. In the hidden layer, we use the Leaky ReLU activation function. Unlike a regular ReLU function, which maps any negative input to 0, Leaky ReLU allows a small, nonzero gradient for negative inputs. This prevents gradients from dying out during training, which tends to yield better training outcomes.

At the output layer, we employ the tanh activation function, which scales the output values to the range [–1, 1]. The reason for using tanh (as opposed to, say, sigmoid, which would output values in the more typical 0 to 1 range) is that tanh tends to produce crisper images.
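To see what these two activations do numerically, here is a small NumPy sketch. The `leaky_relu` helper is our own illustration; the listings below use Keras’s built-in LeakyReLU layer with the same alpha=0.01:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Unlike ReLU, negative inputs are scaled by alpha rather than zeroed out.
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 3.0])
print(leaky_relu(x))  # [-0.02  0.    3.  ] -- small nonzero slope for x < 0
print(np.tanh(x))     # every value lies strictly within (-1, 1)
```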

The following listing implements the Generator.

Listing 3.3. Generator
def build_generator(img_shape, z_dim):
    model = Sequential()

    model.add(Dense(128, input_dim=z_dim))              #1

    model.add(LeakyReLU(alpha=0.01))                    #2

    model.add(Dense(28 * 28 * 1, activation='tanh'))    #3

    model.add(Reshape(img_shape))                       #4

    return model

3.4.3. Implementing the Discriminator

The Discriminator takes in a 28 × 28 × 1 image and outputs a probability indicating whether the input is deemed real rather than fake. The Discriminator is represented by a two-layer neural network, with 128 hidden units and a Leaky ReLU activation function at the hidden layer.

For simplicity, our Discriminator network looks almost identical to the Generator. This does not have to be the case; indeed, in most GAN implementations, the Generator and Discriminator network architectures vary greatly in both size and complexity.

Notice that unlike for the Generator, in the following listing we apply the sigmoid activation function at the Discriminator’s output layer. This ensures that our output value will be between 0 and 1, so it can be interpreted as the probability the Discriminator assigns that the input is real.

Listing 3.4. Discriminator
def build_discriminator(img_shape):

    model = Sequential()

    model.add(Flatten(input_shape=img_shape))      #1

    model.add(Dense(128))                          #2

    model.add(LeakyReLU(alpha=0.01))               #3

    model.add(Dense(1, activation='sigmoid'))      #4

    return model

3.4.4. Building the model

In listing 3.5, we build and compile the Generator and Discriminator models implemented previously. Notice that in the combined model used to train the Generator, we keep the Discriminator parameters fixed by setting discriminator.trainable to False. Also note that the combined model, in which the Discriminator is set to untrainable, is used to train the Generator only. The Discriminator is trained as an independently compiled model. (This will become apparent when we review the training loop.)

We use binary cross-entropy as the loss function we are seeking to minimize during training. Binary cross-entropy is a measure of the difference between computed probabilities and actual probabilities for predictions with only two possible classes. The greater the cross-entropy loss, the further away our predictions are from the true labels.
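To make this concrete, here is a minimal NumPy sketch of the binary cross-entropy formula that Keras computes for us; the helper name and the epsilon clipping (to avoid log(0)) are our own:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    # Mean of -[y*log(p) + (1-y)*log(1-p)]; eps guards against log(0).
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))

y_true = np.array([1.0, 1.0, 0.0])
print(binary_cross_entropy(y_true, np.array([0.9, 0.8, 0.1])))  # small loss
print(binary_cross_entropy(y_true, np.array([0.2, 0.3, 0.9])))  # much larger loss
```

The second call shows confident-but-wrong predictions being penalized far more heavily than near-correct ones.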

To optimize each network, we use the Adam optimization algorithm. This algorithm, whose name is derived from adaptive moment estimation, is an advanced gradient-descent-based optimizer. The inner workings of this algorithm are beyond the scope of this book, but it suffices to say that Adam has become the go-to optimizer for most GAN implementations thanks to its often superior performance.
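For the curious, here is a minimal sketch of a single Adam update step, shown only to illustrate the “adaptive moment estimation” idea; in the tutorial we simply use Keras’s built-in Adam() with its default hyperparameters:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: moving averages of the gradient (m) and its square (v)."""
    m = beta1 * m + (1 - beta1) * grad          # first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
theta, m, v = adam_step(theta, np.array([2.0]), m, v, t=1)
print(theta)  # first step has size ~lr, regardless of the gradient's scale
```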

Listing 3.5. Building and compiling the GAN
def build_gan(generator, discriminator):

    model = Sequential()

    model.add(generator)                                    #1
    model.add(discriminator)

    return model


discriminator = build_discriminator(img_shape)              #2
discriminator.compile(loss='binary_crossentropy',
                      optimizer=Adam(),
                      metrics=['accuracy'])

generator = build_generator(img_shape, z_dim)               #3

discriminator.trainable = False                             #4

gan = build_gan(generator, discriminator)                   #5
gan.compile(loss='binary_crossentropy', optimizer=Adam())

3.4.5. Training

The training code in listing 3.6 implements the GAN training algorithm. We get a random mini-batch of MNIST images as real examples and generate a mini-batch of fake images from random noise vectors z. We then use those to train the Discriminator network while keeping the Generator’s parameters constant. Next, we generate a mini-batch of fake images and use those to train the Generator network while keeping the Discriminator’s parameters fixed. We repeat this for each iteration.

We use binary labels: 1 for real images and 0 for fake ones. To generate z, we sample from the standard normal distribution (a bell curve with 0 mean and a standard deviation of 1). The Discriminator is trained to assign fake labels to the fake images and real labels to the real images. The Generator is trained such that the Discriminator assigns real labels to the fake examples it produces.

Notice that we are rescaling the real images in the training dataset to the range –1 to 1. As you saw in the preceding example, the Generator uses the tanh activation function at the output layer, so the fake images will be in the range (–1, 1). Accordingly, we have to rescale all the Discriminator’s inputs to the same range.

Listing 3.6. GAN training loop
losses = []
accuracies = []
iteration_checkpoints = []

def train(iterations, batch_size, sample_interval):

    (X_train, _), (_, _) = mnist.load_data()                        #1

    X_train = X_train / 127.5 - 1.0                                 #2
    X_train = np.expand_dims(X_train, axis=3)

    real = np.ones((batch_size, 1))                                 #3

    fake = np.zeros((batch_size, 1))                                #4

    for iteration in range(iterations):



        idx = np.random.randint(0, X_train.shape[0], batch_size)    #5
        imgs = X_train[idx]

        z = np.random.normal(0, 1, (batch_size, 100))               #6
        gen_imgs = generator.predict(z)

        d_loss_real = discriminator.train_on_batch(imgs, real)      #7
        d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)
        d_loss, accuracy = 0.5 * np.add(d_loss_real, d_loss_fake)



        z = np.random.normal(0, 1, (batch_size, 100))               #8
        gen_imgs = generator.predict(z)

        g_loss = gan.train_on_batch(z, real)                        #9

        if (iteration + 1) % sample_interval == 0:

            losses.append((d_loss, g_loss))                         #10
            accuracies.append(100.0 * accuracy)
            iteration_checkpoints.append(iteration + 1)

            print("%d [D loss: %f, acc.: %.2f%%] [G loss: %f]" %    #11
                  (iteration + 1, d_loss, 100.0 * accuracy, g_loss))

            sample_images(generator)                                #12

3.4.6. Outputting sample images

In the Generator training code, you may notice an invocation of the sample_images() function. This function gets called every sample_interval iterations and outputs a 4 × 4 grid of images synthesized by the Generator in the given iteration. After we run our model, we will use these images to inspect interim and final outputs.

Listing 3.7. Displaying generated images
def sample_images(generator, image_grid_rows=4, image_grid_columns=4):

    z = np.random.normal(0, 1, (image_grid_rows * image_grid_columns, z_dim)) #1

    gen_imgs = generator.predict(z)                                           #2

    gen_imgs = 0.5 * gen_imgs + 0.5                                           #3

    fig, axs = plt.subplots(image_grid_rows,                                  #4
                            image_grid_columns,
                            figsize=(4, 4),
                            sharey=True,
                            sharex=True)

    cnt = 0
    for i in range(image_grid_rows):
        for j in range(image_grid_columns):
            axs[i, j].imshow(gen_imgs[cnt, :, :, 0], cmap='gray')             #5
            axs[i, j].axis('off')
            cnt += 1

3.4.7. Running the model

That brings us to the final step, shown in listing 3.8. We set the training hyperparameters—the number of iterations and the batch size—and train the model. There is no tried-and-true method to determine the right number of iterations or the right batch size; we determine them experimentally through trial and error as we observe the training progress.

That said, there are important practical constraints on these numbers: each mini-batch must be small enough to fit inside the processing memory (typical batch sizes are powers of 2: 32, 64, 128, 256, and 512). The number of iterations also has a practical constraint: the more iterations we have, the longer the training process takes. With complex deep learning models like GANs, this can get out of hand quickly, even with significant computing power.

To determine the right number of iterations, we monitor the training loss and set the iteration number around the point where the loss plateaus, indicating that we are getting little to no incremental improvement from further training. (Because this is a generative model, overfitting is as much a concern as it is for supervised learning algorithms.)
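One way to mechanize this plateau check is to compare moving averages of the recorded loss checkpoints (such as the losses list from listing 3.6). This is an illustrative sketch; the window size and tolerance are arbitrary choices, not values from the book:

```python
import numpy as np

def loss_plateaued(loss_history, window=5, tol=0.01):
    """Return True once the mean loss over the last `window` checkpoints
    differs by less than `tol` from the mean over the window before it."""
    if len(loss_history) < 2 * window:
        return False
    recent = np.mean(loss_history[-window:])
    previous = np.mean(loss_history[-2 * window:-window])
    return bool(abs(previous - recent) < tol)

falling = [1.0, 0.8, 0.6, 0.5, 0.4, 0.35, 0.3, 0.28, 0.26, 0.24]
flat = falling + [0.25, 0.24, 0.25, 0.26, 0.25, 0.25, 0.25, 0.24, 0.25, 0.25]
print(loss_plateaued(falling))  # False -- still improving
print(loss_plateaued(flat))     # True  -- little incremental improvement left
```

In practice, eyeballing a plot of the recorded losses works just as well; this merely makes the stopping criterion explicit.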

Listing 3.8. Running the model
iterations = 20000                                #1
batch_size = 128
sample_interval = 1000

train(iterations, batch_size, sample_interval)    #2

3.4.8. Inspecting the results

Figure 3.6 shows example images produced by the Generator over the course of training iterations, from earliest to latest.

Figure 3.6. Starting from what looks to be no more than random noise, the Generator gradually learns to emulate the features of the training dataset: in our case, images of handwritten digits.

As you can see, the Generator starts out by producing little more than random noise. Over the course of the training iterations, it gets better and better at emulating the features of the training data. Each time the Discriminator rejects a generated image as false or accepts one as real, the Generator improves a little. Figure 3.7 shows examples of images the Generator can synthesize after it is fully trained.

Figure 3.7. Although far from perfect, our simple two-layer Generator learned to produce realistic-looking numerals, such as 9 and 1.

For comparison, figure 3.8 shows a randomly selected sample of real images from the MNIST dataset.

Figure 3.8. Example of real handwritten digits from the MNIST dataset used to train our GAN. Although the Generator made impressive progress toward emulating the training data, the difference between the numerals it produces and the real, human-written numerals remains clear.

3.5. Conclusion

Although the images our GAN generated are far from perfect, many of them are easily recognizable as real numerals—an impressive achievement, given that we used only a simple two-layer network architecture for both the Generator and the Discriminator. In the following chapter, you will learn how to improve the quality of the generated images by using a more complex and powerful neural network architecture for the Generator and the Discriminator: convolutional neural networks.

Summary

  • GANs consist of two networks: the Generator (G) and the Discriminator (D), each with its own loss function: J(G)(θ(G), θ(D)) and J(D)(θ(G), θ(D)), respectively.
  • During training, the Generator and the Discriminator can tune only their own parameters: θ(G) and θ(D), respectively.
  • The two GAN networks are trained simultaneously via a game-like dynamic. The Generator seeks to maximize the Discriminator’s false-positive classifications (classifying a generated image as real), while the Discriminator seeks to minimize its false-positive and false-negative classifications.