Chapter 10. Nonlinear regression with generalized additive models


This chapter covers

  • Including polynomial terms in linear regression
  • Using splines in regression
  • Using generalized additive models (GAMs) for nonlinear regression

In chapter 9, I showed you how linear regression can be used to create very interpretable regression models. One of the strongest assumptions made by linear regression is that there is a linear relationship between each predictor variable and the outcome. This is often not the case, so in this chapter I’ll introduce you to a class of models that allows us to model nonlinear relationships in the data.

We’ll start by discussing how we can include polynomial terms in linear regression to model nonlinear relationships, and the advantages and disadvantages of doing this. We’ll then move on to the more sophisticated generalized additive models, which give us considerably more flexibility to model complex nonlinear relationships. I’ll also show you how these generalized additive models can handle both continuous and categorical variables, just like in linear regression.

By the end of this chapter, I hope you’ll understand how to create nonlinear regression models that are still surprisingly interpretable. We will continue to work with the ozone dataset we were using in the previous chapter. If you no longer have the ozoneClean object defined in your global environment, just rerun listings 9.1 and 9.2 from chapter 9.

10.1. Making linear regression nonlinear with polynomial terms

In this section, I'll show you how we can take the general linear model we discussed in the previous chapter and extend it to include nonlinear, polynomial relationships between predictor variables and the outcome variable. Linear regression makes the strong assumption that there is a linear relationship between the predictor variables and the outcome. Sometimes real-world variables have linear relationships, or can be sufficiently approximated by one, but often they do not. Surely the general linear model falls down when faced with nonlinear relationships, right? After all, it's called the general linear model and uses the equation of a straight line. Well, it turns out that the general linear model is surprisingly flexible, and we can use it to model polynomial relationships.

Recall from high school math that a polynomial equation is just an equation with multiple terms (single numbers or variables). If all the terms in the equation are raised to the power of 1 (an exponent of 1), in other words they are all equal to themselves, the equation is a first-degree polynomial. If the highest exponent in the equation is 2, meaning one or more of the terms are squared but there are no higher exponents, the equation is a second-degree or quadratic polynomial. If the highest exponent is 3, the equation is a cubic polynomial; and if the highest exponent is 4, the equation is a quartic polynomial.

Tip

Although there are names for higher-degree polynomials, people usually just call them nth-degree polynomials (for example, a fifth-degree polynomial). This is, of course, unless you want to sound super-precocious!

Let’s have a look at some examples of nth-degree polynomials:

  • y = x¹ (linear)
  • y = x² (quadratic)
  • y = x³ (cubic)
  • y = x⁴ (quartic)

The shape of these functions is shown for values of x between –30 and 30 in figure 10.1. When the exponent is 1, the function is a straight line; but when the exponent is greater than 1, the function is curved.
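If you'd like to reproduce shapes like those in figure 10.1 yourself, the following is a minimal base-R sketch (not one of the chapter's numbered listings) that draws each curve with curve():

par(mfrow = c(2, 2))                                  # 2 x 2 grid, one panel per degree

curve(x^1, from = -30, to = 30, main = "y = x (linear)")
curve(x^2, from = -30, to = 30, main = "y = x^2 (quadratic)")
curve(x^3, from = -30, to = 30, main = "y = x^3 (cubic)")
curve(x^4, from = -30, to = 30, main = "y = x^4 (quartic)")

par(mfrow = c(1, 1))                                  # reset the plotting device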

We can use this to our advantage: if the relationship between a predictor variable and the outcome variable is curved, we might be able to model it by including nth-degree polynomials in our model definition. Think back to our cider example in chapter 9. Imagine that instead of a linear relationship between apple content and cider batch pH, we have a downward curvilinear relationship like the one illustrated in figure 10.2. A straight line no longer models this relationship very well, and predictions made by such a model are likely to have high bias. Instead, we can better model this relationship by including a quadratic term in the model definition.

Figure 10.1. Shapes of polynomial functions from the first to the fourth degree. When the x variable is raised to the first power, the equation models a straight line. As we increase the power that x is raised to, the equations model lines with varying degrees of flexibility.
Figure 10.2. Comparing linear and quadratic fits to an imaginary nonlinear relationship between apple content and cider acidity

The formula for the model shown in figure 10.2 would be

  • y = βapples × apples + βapples2 × apples² + ϵ

where βapples2 is the slope for the apples² term, which is more easily understood as how much the line curves as apple content increases (larger absolute values result in a more extreme curve). For a single predictor variable, we can generalize this to any nth-degree polynomial relationship as

  • y = β0 + β1x + β2x² + ... + βnxⁿ + ϵ

where n is the highest degree of polynomial you're modeling. Notice that when performing polynomial regression, it's usual to include all the lower-degree terms for that predictor variable as well. For example, if you're modeling a quartic relationship between two variables, you would include x, x², x³, and x⁴ terms in your model definition. Why is this? If we don't include the lower-degree terms in the model, the vertex of the curve (the part of it that flattens out, either at the top or bottom of the curve, depending on which direction it curves) is forced to pass through x = 0. This might be a reasonable constraint to place on the model, but it usually isn't. Instead, if we include the lower-degree terms in the model, the curve doesn't need to pass through x = 0 and can "wiggle around" more to (hopefully) fit the data better. This is illustrated in figure 10.3.
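To make the point about lower-degree terms concrete, here is a small sketch (not a chapter listing; the data are simulated) showing a quadratic fit with lm(). The I() function tells R to treat ^ arithmetically inside a formula:

set.seed(1)
x <- runif(100, -3, 3)                                 # simulated predictor
y <- 2 + 1.5 * x - 0.8 * x^2 + rnorm(100, sd = 0.5)    # simulated quadratic outcome

quadModel <- lm(y ~ x + I(x^2))    # includes the lower-degree (x) term
vertexAt0 <- lm(y ~ I(x^2))        # omits it: the vertex is forced to x = 0

summary(quadModel)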

Figure 10.3. Comparing the shape of polynomial functions that do and do not include the first-degree term. Vertical dotted lines indicate the position of each function’s vertex on the x-axis.

Just as we saw in chapter 9, when the model is given new data, it multiplies the values of the predictor variables (including the specified exponents) by their slopes and then adds them all together, along with the intercept, to get the predicted value. The model we're using is still the general linear model, because we're linearly combining the model terms (adding them together).
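Continuing the simulated example above (quadModel is the hypothetical model from that sketch, not something built in the chapter), we can check that a prediction really is just the intercept plus the slope-weighted terms:

newX <- 1.7
coefs <- coef(quadModel)

# Multiply each term by its slope and add the intercept ...
manual <- coefs["(Intercept)"] + coefs["x"] * newX + coefs["I(x^2)"] * newX^2

# ... which matches what predict() returns.
predict(quadModel, newdata = data.frame(x = newX))
manual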

10.2. More flexibility: Splines and generalized additive models

When using polynomial terms in linear regression, the higher the degree of polynomial we use, the more flexible our model will be. High-degree polynomials allow us to capture complicated nonlinear relationships in the data but are therefore more likely to overfit the training set. Sometimes, increasing the degree of the polynomials doesn't help anyway, because the relationship between the predictor variable and the outcome variable may not be the same across the range of the predictor. In such situations, instead of using high-degree polynomials, we can use splines. In this section, I'll explain what splines are and how to use them, and how they relate to polynomials and to a set of models called generalized additive models (GAMs).

A spline is a piecewise polynomial function. This means it splits the predictor variable into regions and fits a separate polynomial within each region, and these regions connect to each other via knots. A knot is a position along the predictor variable that divides the regions within which the separate polynomials are fit. The polynomial curve in each region of the predictor passes through the knots that delimit that region. This allows us to model complex nonlinear relationships that are not constant across the range of the predictor variable. This is illustrated in figure 10.4 using our cider example.

Figure 10.4. Fitting a spline to a nonlinear relationship. The solid dots indicate the knots. Individual polynomial functions fit the data between the knots and connect to each other through them.

Using splines is a great way of modeling complicated relationships such as the one shown in figure 10.4, but this approach has some limitations (the code sketch after this list illustrates the first two):

  • The position and number of the knots need to be chosen manually. Both choices can have a big impact on the shape of the spline. Knot positions are typically placed either at obvious regions of change in the data or at regular intervals across the predictor, such as at the quartiles.
  • The degree of the polynomials between the knots needs to be chosen. We generally use cubic splines or higher, because these ensure that the polynomials connect with each other smoothly through the knots (quadratic polynomials may leave the spline disconnected at the knots).
  • It can become difficult to combine splines of different predictors.
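To illustrate the first two limitations, here is a sketch (again, not a chapter listing; the data are simulated) of fitting a spline by hand with the splines package, where both the knot positions and the polynomial degree must be chosen manually:

library(splines)

set.seed(2)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.2)                 # simulated nonlinear relationship

knots <- quantile(x, probs = c(0.25, 0.5, 0.75))   # knots chosen manually, here at the quartiles
splineModel <- lm(y ~ bs(x, knots = knots, degree = 3))   # cubic polynomials between the knots

plot(x, y)
lines(x, predict(splineModel), lwd = 2)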

So, can we do better than simple spline regression? Absolutely. The solution is GAMs. GAMs extend the general linear model such that instead of

  • y = β0 + β1x1 + β2x2 + ... + βkxk + ϵ

they take the form

  • y = β0 + f1(x1) + f2(x2) + ... + fk(xk) + ϵ

where each f(x) represents a function of a particular predictor variable. These functions can be any sort of smoothing function but will typically be a combination of multiple splines.
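Later in the chapter we'll fit a GAM through mlr's gamboost learner, but as a point of comparison, here is a minimal sketch of the same idea using the mgcv package (an illustration on my part; mgcv isn't used in the chapter's listings). Each s() term asks for a spline-based smoothing function of that predictor:

library(mgcv)

set.seed(3)
gamData <- data.frame(x1 = runif(300), x2 = runif(300))    # simulated predictors
gamData$y <- sin(2 * pi * gamData$x1) + gamData$x2^2 + rnorm(300, sd = 0.2)

gamFit <- gam(y ~ s(x1) + s(x2), data = gamData)   # one smoothing function per predictor

summary(gamFit)
plot(gamFit, pages = 1)                            # the learned smooth for each predictor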

Note

Can you see that the general linear model is a special case of the generalized additive model, where the function for each predictor variable is the identity function (f(x) = x)? We can go one step further and say that the generalized linear model is also a special case of the generalized additive model. This is because we can also use different link functions with GAMs, allowing us to use them to predict categorical variables (as in logistic regression) or count variables.

10.2.1. How GAMs learn their smoothing functions

Figure 10.5. Smoothing functions for continuous variables in GAMs are commonly the sum of a series of basis functions, which are often splines. Three spline basis functions are summed at each value of x to predict the value of y. The dotted line shows the sum of the three basis functions, which models the nonlinear relationship in the data.

The most common method of constructing these smoothing functions is to use splines as basis functions. Basis functions are simple functions that can be combined to form a more complex function. Take a look at figure 10.5. The nonlinear relationship between the x and y variables is modeled as a weighted sum of three splines. In other words, at each value of x, we sum the contributions from each of these basis functions to give us the function that models the relationship (the dotted line). The overall function is a weighted sum because each basis function has a corresponding weight, determining how much it contributes to the final function.

Let’s take another look at the GAM formula:

  • y = β0 + f1(x1) + f2(x2) + ... + fk(xk) + ϵ

So each fk(xk) is a smoothing function of that particular variable. When these smoothing functions use splines as basis functions, the function can be expressed as

  • f(xk) = a1b1(xk) + a2b2(xk) + ... + anbn(xk)

where b1(xk) is the value of the first basis function evaluated at a particular value of x, and a1 is the weight of the first basis function. GAMs estimate the weights of these basis functions in order to minimize the residual squared error of the model.
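To see this weighted sum of basis functions in code, here is a sketch (not a chapter listing; the data are simulated) that builds a spline basis explicitly and lets lm() estimate the weights a1 through an:

library(splines)

set.seed(4)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.2)

B <- bs(x, df = 6)        # each column of B is one basis function evaluated at every value of x

basisFit <- lm(y ~ B)     # lm() estimates the intercept and the basis weights
weights <- coef(basisFit)

# Rebuild the smooth by hand: intercept + sum of weight_i * basis_i(x) ...
smoothByHand <- weights[1] + B %*% weights[-1]

# ... and confirm it equals the fitted values from lm().
all.equal(as.numeric(smoothByHand), unname(fitted(basisFit)))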

GAMs automatically learn a nonlinear relationship between each predictor variable and the outcome variable, and then add these effects together linearly, along with the intercept. GAMs overcome the limitations of simply using splines in the general linear model by doing the following:

  • Automatically selecting the knots for the spline functions
  • Automatically selecting the degree of flexibility of the smoothing functions by controlling the weights of the basis functions
  • Allowing us to combine splines of multiple predictor variables simultaneously
Tip

If I want to use linear modeling and the relationship between my predictors and outcome variable is nonlinear, GAMs are my go-to model. This is because of their flexibility and their ability to overcome the limitations of polynomial regression. The exception is if I have a theoretical reason to believe there is a specific polynomial relationship (say, quadratic) in the data. In such a situation, using linear regression with a polynomial term may result in a simpler model, where a GAM might overfit.

10.2.2. How GAMs handle categorical variables

So far, I've shown you that GAMs learn nonlinear relationships between our predictor variables and our outcome. But what about when our predictor variables are categorical? Well, GAMs can handle categorical variables in two different ways.

One method is to treat categorical variables exactly the same way we do for the general linear model, and create k – 1 dummy variables that encode the effect of each level of the predictor on the outcome. When we use this method, the predicted value of a case is simply the sum of all of the smoothing functions, plus the contributions from the categorical variable effects. This method assumes independence between the categorical variable and the continuous variables (in other words, the smoothing functions are the same across each level of the categorical variable).

The other method is to model a separate smoothing function for each level of the categorical variable. This is important in situations where there are distinct nonlinear relationships between the continuous variables and the outcome at each level of a categorical variable.
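For illustration only (the chapter itself builds its GAM with gamboost through mlr), here is roughly how the two approaches look in mgcv syntax, using made-up variable names. The first model treats the factor as an additive, dummy-coded effect; the second fits a separate smooth of x for each factor level via the by argument:

library(mgcv)

set.seed(5)
catData <- data.frame(x = runif(400),
                      grp = factor(sample(c("a", "b"), 400, replace = TRUE)))
catData$y <- ifelse(catData$grp == "a", sin(2 * pi * catData$x), catData$x^2) +
             rnorm(400, sd = 0.2)

sharedSmooth    <- gam(y ~ s(x) + grp, data = catData)             # method 1: one smooth, plus level effects
separateSmooths <- gam(y ~ s(x, by = grp) + grp, data = catData)   # method 2: one smooth per level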

Note

When specifying a GAM as our learner through mlr, the default method is the first approach.

GAMs are extraordinarily flexible and powerful for a huge range of machine learning problems. If you would like to delve deeper into the nuts and bolts of GAMs, I recommend Generalized Additive Models: An Introduction with R by Simon Wood (Chapman and Hall/CRC, 2017).

I hope by now you have a basic understanding of polynomial regression and GAMs, so let's turn this knowledge into skills by building your first nonlinear regression model!

10.3. Building your first GAM

We finished chapter 9 by interrogating the diagnostic plots of our linear regression model and deciding it looked as though we have nonlinear relationships in the data. Therefore, in this section I'm going to show you how to model the data using a GAM, to account for the nonlinear relationships between the predictors and the outcome.

I'll start with some feature engineering. From figure 9.7 in chapter 9, it looks like there's a curved relationship between Month and Ozone, peaking in summer and declining in winter. Because we also have access to the day of the month, let's see if we can get a more predictive value by combining the two. Put another way, instead of getting month-of-the-year resolution, let's get day-of-the-year resolution from our data.

To achieve this, we mutate a new column called DayOfYear. We use the interaction() function to generate a variable that contains the information from both the Date and Month variables. Because the interaction() function returns a factor, we wrap it inside the as.numeric() function to convert it into a numeric vector that represents the days of the year.

Exercise 1

To get a better idea of what interaction() is doing, run the following:

interaction(1:4, c("a", "b", "c", "d"))

Because the new variable contains the information from the Date and Month variables, we remove them from the data using the select() function, as they are now redundant. We then plot our new variable to see how it relates to Ozone.

Listing 10.1. Creating an interaction between Date and Month
ozoneForGam <- mutate(ozoneClean,
                      DayOfYear = as.numeric(interaction(Date, Month))) %>%
               select(c(-"Date", -"Month"))

ggplot(ozoneForGam, aes(DayOfYear, Ozone)) +
  geom_point() +
  geom_smooth() +
  theme_bw()

The resulting plot is shown in figure 10.6. The relationship between ozone levels and the time of year is even clearer if we use day, instead of month, resolution.

Exercise 2

Add another geom_smooth() layer to the plot, using these arguments to fit a quadratic polynomial line to the data:

  • method = "lm"
  • formula = "y ~ x + I(x^2)"
  • col = "red"

Does this polynomial relationship fit the data well?

Figure 10.6. Plotting the DayOfYear variable against ozone levels

Now let's define our task, imputation wrapper, and feature-selection wrapper, just as we did for our linear regression model. Sadly, there isn't yet an implementation of ordinary GAMs wrapped by mlr (such as from the mgcv package). Instead, however, we have access to the gamboost algorithm, which uses boosting (as you learned about in chapter 8) to learn an ensemble of GAM models. Therefore, for this exercise, we'll use the regr.gamboost learner. Other than the different learner (regr.gamboost instead of regr.lm), we create our imputation and feature selection wrappers exactly the same way as in listing 9.13.

Listing 10.2. Defining the task and wrappers
gamTask <- makeRegrTask(data = ozoneForGam, target = "Ozone")        # define the regression task

imputeMethod <- imputeLearner("regr.rpart")                          # impute missing values with rpart

gamImputeWrapper <- makeImputeWrapper("regr.gamboost",
                                      classes = list(numeric = imputeMethod))

gamFeatSelControl <- makeFeatSelControlSequential(method = "sfbs")   # sequential floating backward selection

kFold <- makeResampleDesc("CV", iters = 10)                          # 10-fold CV for the feature selection

gamFeatSelWrapper <- makeFeatSelWrapper(learner = gamImputeWrapper,
                                        resampling = kFold,
                                        control = gamFeatSelControl)
Note

The authors of mlr wrote it to allow the incorporation of virtually any machine learning algorithm. If there is an algorithm from a package you want to use that isn't yet wrapped by mlr, you can implement it yourself so that you can use mlr's functionality with it. While doing so isn't super-complicated, it does take a bit of explaining. Therefore, if you want to do this, I recommend following the mlr tutorial at http://mng.bz/gV5x, which does a good job of explaining the process.

All that's left to do is cross-validate the model-building process. Because the gamboost algorithm is much more computationally intensive than linear regression, we're only going to use holdout as the method for outer cross-validation.

Warning

This takes about 1.5 minutes to run on my four-core machine.

Listing 10.3. Cross-validating the GAM model-building process
holdout <- makeResampleDesc("Holdout")

gamCV <- resample(gamFeatSelWrapper, gamTask, resampling = holdout)

gamCV

Resample Result
Task: ozoneForGam
Learner: regr.gamboost.imputed.featsel
Aggr perf: mse.test.mean=16.4009
Runtime: 147.441

Great! Our cross-validation suggests that modeling the data using the gamboost algorithm will outperform a model learned by linear regression (the latter gave us a mean MSE of 22.8 in the previous chapter).

Gxw xfr’c lcaulyta ibldu c mode f av J ans zwpv gqv wvd kr nettgoreira huet KCW models er dnautensrd ory oaeinrnln functions porp’xx nrleaed ltx htku predictor variables.

Warning

This takes about 3 minutes to run on my four-core machine.

Listing 10.4. Training a GAM
library(parallel)
library(parallelMap)

parallelStartSocket(cpus = detectCores())

gamModel <- train(gamFeatSelWrapper, gamTask)

parallelStop()

gamModelData <- getLearnerModel(gamModel, more.unwrap = TRUE)

First, we train a boosted GAM using our gamTask. We can just use gamFeatSelWrapper as our learner, because this performs imputation and feature selection for us. To speed things along, we can parallelize the feature selection by running the parallelStartSocket() function before running the train() function to actually train the model.

We then extract the model information using the getLearnerModel() function. This time, because our learner is a wrapper function, we need to supply an additional argument, more.unwrap = TRUE, to tell mlr that it needs to go all the way down through the wrappers to extract the base model information.

Now, let's understand our model a little better by plotting the functions it learned for each of the predictor variables. This is as easy as calling plot() on our model information. We can also look at the residuals from the model by extracting them with the resid() function. This allows us to plot the predicted values (extracted with the $fitted() component) against their residuals to look for patterns that suggest a poor fit. We can also plot the quantiles of the residuals against the quantiles of a theoretical normal distribution, using qqnorm() and qqline(), to see if they are normally distributed.

Listing 10.5. Plotting our GAM
par(mfrow = c(3, 3))                               # split the plotting device into a 3 x 3 grid

plot(gamModelData, type = "l")                     # the smoothing function learned for each predictor

plot(gamModelData$fitted(), resid(gamModelData))   # residuals vs. fitted values

qqnorm(resid(gamModelData))                        # quantiles of the residuals vs. a normal distribution

qqline(resid(gamModelData))                        # reference line for normally distributed residuals

par(mfrow = c(1, 1))
Tip

Because we're about to create a subplot for every predictor, plus two for the residuals, we first divide the plotting device into nine parts using the mfrow argument of the par() function. We set this back again afterward using the same function. You may have a different number of predictors than I do, depending on what your feature selection returns.

The resulting plot is shown in figure 10.7. For each predictor, we get a plot of its values against how much that predictor contributes to the ozone estimate across those values. These curves show the shape of the functions learned by the algorithm, and we can see that they are all nonlinear.

Figure 10.7. Plotting the nonlinear relationships learned by our GAM. The rug at the base of each plot shows the position of each case along the x-axis. The residual vs. fitted plot (middle panel of the second row) shows a pattern suggestive of heteroscedasticity, and the normal Q-Q plot (right panel of the second row) shows the residuals are normally distributed.
Tip

The "rug" of tick marks at the base of each plot indicates the positions of the training cases. This helps us identify regions of each variable that have few cases, such as at the top end of the Visib variable. GAMs have the potential to overfit in regions with few cases.

Finally, looking at the residual plots, we can still see a pattern, which may indicate heteroscedasticity in the data. We could try training a model on a transformed Ozone variable (such as log10) to see if this helps, or use a model that doesn't make this assumption. The quantile plot shows that most of the residuals lie close to the diagonal line, indicating that they approximate a normal distribution, with some deviation at the tails (which isn't uncommon).
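If you wanted to try the log10 transformation suggested here, one way (a sketch, assuming the Ozone values are strictly positive; it isn't one of the chapter's listings) is to cross-validate the same wrapped learner on a transformed copy of the data:

ozoneLogTrans <- mutate(ozoneForGam, Ozone = log10(Ozone))   # log10-transform the outcome

gamLogTask <- makeRegrTask(data = ozoneLogTrans, target = "Ozone")

gamLogCV <- resample(gamFeatSelWrapper, gamLogTask, resampling = holdout)

gamLogCV   # note: this MSE is on the log10 scale, so it isn't directly comparable to 16.4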

10.4. Strengths and weaknesses of GAMs

While it often isn't easy to tell which algorithms will perform well for a given task, here are some strengths and weaknesses that will help you decide whether GAMs will perform well for you.

The strengths of GAMs are as follows:

  • They produce models that are very interpretable, despite being nonlinear.
  • They can handle both continuous and categorical predictors.
  • They can automatically learn nonlinear relationships in the data.

The weaknesses of GAMs are these:

  • They still make some strong assumptions about the data, such as homoscedasticity and the distribution of the residuals (performance may suffer if these are violated).
  • GAMs have a propensity to overfit the training set.
  • GAMs can be particularly poor at predicting data outside the range of values of the training set.
  • They cannot handle missing data.
Exercise 3

Just as in exercise 3 in chapter 9, instead of using a wrapper method, cross-validate the process of building our GAM using a filter method. Are the estimated MSE values similar? Which method is faster? Tips:

  1. First, create a filter wrapper, using gamImputeWrapper as the learner.
  2. Define a hyperparameter space to tune "fw.abs" using makeParamSet().
  3. Create a grid search definition using makeTuneControlGrid().
  4. Define a tuning wrapper that takes the filter wrapper as a learner and performs a grid search.
  5. Use resample() to perform cross-validation, using the tuning wrapper as the learner.

Summary

  • Polynomial terms can be included in linear regression to model nonlinear relationships between the predictor variables and the outcome.
  • Generalized additive models (GAMs) are supervised learners for regression problems that can handle continuous and categorical predictors.
  • GAMs use the equation of a straight line, but allow nonlinear relationships between the predictor variables and the outcome.
  • The nonlinear functions learned by GAMs are often splines created from the sum of a series of basis functions.

Solutions to exercises

  1. Experiment with the interaction() function:
interaction(1:4, c("a", "b", "c", "d"))
  2. Add a geom_smooth() layer, fitting a quadratic relationship to the data:
ggplot(ozoneForGam, aes(DayOfYear, Ozone)) +
  geom_point() +
  geom_smooth() +
  geom_smooth(method = "lm", formula = "y ~ x + I(x^2)", col = "red") +
  theme_bw()

# The quadratic polynomial does a pretty good job of modeling the
# relationship between the variables.
  3. Cross-validate building a GAM but using a filter method:
filterWrapperImp <- makeFilterWrapper(learner = gamImputeWrapper,
                                   fw.method = "linear.correlation")

filterParam <- makeParamSet(
  makeIntegerParam("fw.abs", lower = 1, upper = 12)
)

gridSearch <- makeTuneControlGrid()

tuneWrapper <- makeTuneWrapper(learner = filterWrapperImp,
                               resampling = kFold,
                               par.set = filterParam,
                               control = gridSearch)

filterGamCV <- resample(tuneWrapper, gamTask, resampling = holdout)

filterGamCV