Chapter 13. Ranking and learning to rank

This book is all about learning, and in this chapter, you’ll learn how to rank.

  • You’ll reformulate the recommender problem as a ranking problem.
  • You’ll look at Foursquare’s ranking method and how it uses multiple sources.
  • You’ll go through the different types of Learning to Rank (LTR) algorithms and learn how to distinguish pointwise, pairwise, and listwise comparisons of ranks.
  • You’ll learn about the Bayesian Personalized Ranking (BPR) algorithm, which is a promising algorithm to implement.

Are all these chapters on recommender algorithms starting to look the same? If so, you’re in luck, because now you’re going to start something completely different. Instead of focusing on recommendations as a rating prediction problem, it sometimes makes more sense to look at how the items should be stacked. The catalog item that the user would find most relevant is on top, the second one next, and so on. To define relevancy like this takes away the need to predict ratings. You don’t need to know how favorably users would rate something, only that they’d love it, or at least like it more than everything else that’s available.


Note

Keep in mind that the catalog of content might not contain anything the user would love, but even when that’s the case, you still want to provide a list of the best you can do with what you have.



I think this is going to be an exciting chapter. You’re going to learn about a type of algorithm first introduced in the field of information retrieval (IR) systems—another name for search engines these days. Ranking powers the Microsoft search engine Bing, as well as most other search engines, and Facebook and Foursquare use it too. You’ll see a difference between what they want to find and what a recommender wants, but in the end, most of the research done in the IR world has also been usable for recommenders.

Xxg’ff tsart grcj nreojuy wruj zn example of Eeragnin vr Cvcn teml Zarqoreuus xr ejou xpq s nssee lx ranking. Avy’ff rynx yaro dxzs cyn eoxf rs FRB algorithms nj erlngea, hwcih vzt ddieidv vnjr htere rpsgosbuu. Ae ckxg s orccetne example of s VCB algorithms, bed’ff xenaeim rvq XZT algorithm. Jr pza c jrp xl pcdilcmaote math, ryg ykq’ff tvsieri rj db gindco kqr algorithm jn MovieGEEKs ze pqe nzz zxv jr nj cotina.


13.1. Learning to rank an example at Foursquare

Foursquare is a guide to cities. I use it to find places where I can maintain my coffee addiction (which I promise myself to quit when I finish writing this book). Imagine I’m standing in front of the beautiful St. Peter’s Basilica in Rome. After an exhaustingly long wait in a queue to see the inside of the church, I decide I need coffee, so I flip out my phone, open the Foursquare app, and click coffee near me. The result is shown in figure 13.1 (not completely, but it’s the browser version of Foursquare’s search for the same thing).

Figure 13.1. Looking for a coffee place near St. Peter’s Basilica in Rome using Foursquare

As you can see, the recommendations aren’t ordered by rating or by distance, so how are they ordered? How did Foursquare come up with this list?

Foursquare published an excellent article on how its recommender system works, describing their implementation of learning to rank.[1] I recommend that you read it because it’s a fascinating insight into the challenges of finding points of interest near users. We’ll look at a slightly simpler version than what they use to give you a basic introductory example of learning to rank.

1 Blake Shaw et al., “Learning to Rank for Spatiotemporal Search,” http://mng.bz/vP25.

Like the hybrid recommenders from chapter 12, learning to rank is a way to combine different kinds of data sources, such as popularity, distance, or recommender system outputs. The difference here is that rank doesn’t necessarily have to come from (or parts of) a recommender system. When ranking, you’re looking for input sources that will give you an ordering of the objects. Figure 13.2 shows an overview of the list of features (called features in machine-learning lingo), which according to the article are utilized at Foursquare.

Figure 13.2. List of features used in Foursquare’s algorithm to rank venues near you

Because you don’t have access to the features listed in figure 13.2, let’s stick to two features and see if you can make sense of the coffee rankings I got on the page shown in figure 13.1. The page shows the average rating of each venue; I found the walking distance using Google Maps. If you put this data into a table, it looks like table 13.1.

Table 13.1. The cafe recommendations from Foursquare

Ranking on Foursquare

  #  Name                       Walking time (distance)  Average rating
  1  Al Mio Caffé               2 min                    7.4
  2  Makasar                    4 min                    7.7
  3  Wine Bar de’ Penitenzieri  4 min                    7.5
  4  Castroni                   10 min                   9.2

If you look at the table, it’s easy to see that the rankings aren’t based on ratings. If they were, Castroni would be at the top. It’s also not ranked by distance, or Makasar would have to share its second place with the wine bar. Let’s try feature engineering here and see if you can get closer to predicting the ranking for the four elements using only these two features—distance and average ratings.

First, you need to massage the data so that a higher value denotes a shorter distance and a higher average rating. Being far away isn’t a good thing for a cafe, so you need to invert the distances. Inverting the distance will make places within a short walking distance (time) have a higher value. To do this, find the maximum, which is a 10-minute walk, and subtract each walking time from the maximum. This makes the distance value of Al Mio Caffé be 10 – 2 = 8 and Castroni 10 – 10 = 0. You rescale all the data so that everything is between 0 and 1 because if you don’t, certain algorithms might not work well.[2] Rescaling can be done using the following formula:

  • x_normalized = (x − min) / (max − min)

2 For more information on rescaling, see https://en.wikipedia.org/wiki/Feature_scaling.

Normalizing the data will give you the data in table 13.2.

Table 13.2. Same as table 13.1, except with normalized data

Ranking on Foursquare

  #  Name                       Walking time (normalized)  Average rating (normalized)
  1  Al Mio Caffé               1.00                       0.00
  2  Makasar                    0.75                       0.17
  3  Wine Bar de’ Penitenzieri  0.75                       0.06
  4  Castroni                   0.00                       1.00
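The inversion and rescaling steps can be sketched in a few lines of Python; the numbers come from table 13.1, and the helper name `min_max` is mine:

```python
# Invert walking times (shorter walk => higher value), then
# min-max scale both features into [0, 1], reproducing table 13.2.
cafes = {
    "Al Mio Caffé": (2, 7.4),
    "Makasar": (4, 7.7),
    "Wine Bar de’ Penitenzieri": (4, 7.5),
    "Castroni": (10, 9.2),
}

def min_max(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

inverted = [10 - minutes for minutes, _ in cafes.values()]
distance = min_max(inverted)                      # [1.0, 0.75, 0.75, 0.0]
rating = min_max([r for _, r in cafes.values()])  # rounds to [0.0, 0.17, 0.06, 1.0]
```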

With this change, you have a distance ordering close to the ranking on Foursquare. Because items 2 and 3 are tied on walking time, you need to get their ranking from the ratings.

You’re now at the core of the problem; you want to teach the machine to rank these items based on the input of ratings and distance. You can formalize this a bit more by saying you want the system to learn weights (w0 and w1), which when inserted into the following expression produce a value such that the four items get ordered as on the Foursquare page:

  • f(distance,rating) = w0 × distance + w1 × rating

Because you want to make the function produce an ordering like Foursquare’s, you’re trying to make an algorithm that’s optimized to rank based on the output. In this example, it’s not too hard to guess the values for w0 and w1: if you set them equal to w0=20 and w1=10, you’ll get the score values shown in table 13.3.

Table 13.3. Cafe data with a column showing the score calculated with the previous function f

Ranking on Foursquare

  #  Name                       Walking time (normalized)  Average rating (normalized)  Score
  1  Al Mio Caffé               1.00                       0.00                         20
  2  Makasar                    0.75                       0.17                         16.7
  3  Wine Bar de’ Penitenzieri  0.75                       0.06                         15.6
  4  Castroni                   0.00                       1.00                         10
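A quick check that the guessed weights reproduce table 13.3 (the function name `f` follows the text; the data layout is mine):

```python
# Score each cafe with f(distance, rating) = w0*distance + w1*rating,
# using the hand-picked weights w0=20 and w1=10 from the text.
def f(distance, rating, w0=20, w1=10):
    return w0 * distance + w1 * rating

cafes = [
    ("Al Mio Caffé", 1.00, 0.00),
    ("Makasar", 0.75, 0.17),
    ("Wine Bar de’ Penitenzieri", 0.75, 0.06),
    ("Castroni", 0.00, 1.00),
]
scores = [f(d, r) for _, d, r in cafes]  # rounds to [20.0, 16.7, 15.6, 10.0]
```

The scores come out in descending order, which is exactly the order on the Foursquare page.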

Another way to approach the problem is to use linear regression to find the line that best represents the data set. Use the line to find the rank of each item by starting at the point furthest away from (0,0) and working inward. This is shown in figure 13.3. The angle of the line determines which of the two features (ratings or distance) will be given the most importance.

Figure 13.3. Projecting the points to a line shows an ordering of the items

Figure 13.3 also provides a view of what you’re trying to solve here. You have two different dimensions: the distance and the average ratings. By drawing the line as I did, I got the ordering shown in the figure. If you change the angle of the line, the cafes might come out in a different order. I hope that helped.

Back to the Foursquare example: if you were Foursquare, you’d probably have a pipeline like the one shown in figure 13.4. Now the problem becomes slightly different, because how do you (pretending you’re Foursquare) optimize the function if you don’t know what it’s supposed to return? You use the check-in feature (read the article for more details). Here I want you to note that finding data that describes how the output should look isn’t always straightforward.

Figure 13.4. A simplified view of the Foursquare ranking system


13.2. Re-ranking

If you read about hybrids in the previous chapter, you might ask what’s the difference between this and the feature-weighted hybrid. Remember that you’re optimizing for two different things. A hybrid recommender always predicts ratings, while an LTR algorithm produces orderings. You’ll look at how to define ordering in the next section.

Some would call the Foursquare ranking a re-ranking, because it takes a list of venues that’s found using a spatial index, meaning that it finds the items closest to you, and then reorders the list to also match the rating criteria.

A simple example of re-ranking in a recommender system is to use a popularity ordering as the base and then re-sort the items using the recommender. Popularity narrows the list to the most popular items and reduces the risk of showing items that are for particular (and maybe unpopular) tastes. This might remind you of a filter bubble, but remember that if a user likes unusual items more than popular items, the unusual ones would still bubble up in the list.

As an example, look at figure 13.3 again and find Castroni. It’s far away, but because its average rating is so high, it manages to get on the Top 4 list. Al Mio Caffé doesn’t have good ratings, but you were basically standing next to the cafe, so even if it’s unpopular, it came first because it was the closest option.

Collaborative filtering algorithms are prone to recommend items liked by few people, but by people who really love the content. The algorithm has no concept of popularity and could be used for re-ranking instead of as the sole source for the ordering. This example is also described on the Netflix TechBlog.[3]

3 For more information on Netflix’s TechBlog, see http://mng.bz/LacO.

Instead of re-ranking items and rating predictions, why not start with the aim of ranking and then construct algorithms that optimize for that? This is the goal of LTR algorithms.


13.3. What’s learning to rank again?

A recommender, or another type of data-driven application that produces ranked lists, is trained using a family of algorithms called Learning to Rank (LTR). A ranking recommender system has a catalog of items. Given a user, the system retrieves items that are relevant to the user and then ranks them so the items at the top of the list are the most applicable.

The ranking is done using a ranking model.[4] A ranking model is trained using an LTR algorithm, which is a supervised learning algorithm, meaning that you provide it with data containing input and output. In your case, that’s a user_id as input and a ranked list of items as output. This family of algorithms has three subgroups—pointwise, pairwise, and listwise—which we’ll quickly review.

4 This definition is loosely taken from the article “A Short Introduction to Learning to Rank” by Hang Li. Available online at http://times.cs.uiuc.edu/course/598f16/l2r.pdf.

13.3.1. The three types of LTR algorithms

LTR algorithms are distinguished by the way they evaluate the ranked list during training. Figure 13.5 illustrates the three different flavors.

Figure 13.5. The three different subgroups of LTR algorithms: pointwise, pairwise, and listwise

Pointwise

The pointwise approach is the same as the recommenders you looked at in earlier chapters. It produces a score for each item and then ranks them accordingly. The difference between rating prediction and ranking is that with ranking, you don’t care if an item has a utility score of a million or within a rating scale, as long as the score symbolizes a position in the rank.

Pairwise

Pairwise is a type of binary classifier. It’s a function that takes two items and returns an ordering of the two. When you talk about pairwise ranking, you usually optimize the output so you have the minimal number of inversions of items compared to the optimal rank of the items. An inversion means that two items change places.
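A minimal sketch of counting inversions between a predicted ranking and the optimal one — the item labels and rankings below are toy data, not the book’s:

```python
# Count pairwise inversions: the number of item pairs that a
# predicted ranking orders differently from the optimal ranking.
def inversions(predicted, optimal):
    pos = {item: i for i, item in enumerate(optimal)}
    count = 0
    for a in range(len(predicted)):
        for b in range(a + 1, len(predicted)):
            # predicted puts predicted[a] before predicted[b];
            # it's an inversion if the optimal ranking disagrees.
            if pos[predicted[a]] > pos[predicted[b]]:
                count += 1
    return count

optimal = ["A", "B", "C", "D"]
print(inversions(["A", "B", "C", "D"], optimal))  # 0
print(inversions(["B", "A", "C", "D"], optimal))  # 1: A and B swapped
print(inversions(["D", "C", "B", "A"], optimal))  # 6: fully reversed
```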

To do pairwise ordering, you need what’s called an absolute ordering. An absolute ordering means that for any two content items in the catalog, you can say one is more relevant than the other or they’re tied.


Note

If you made a pairwise ranking by predicting ratings using the neighborhood model from chapter 8, you wouldn’t have an absolute ordering because the algorithm can’t predict ratings for all items.


Listwise

Listwise is the king of all LTR subgroups because it looks at the whole ranked list and optimizes that. The advantage of listwise ranking is that it intuits that ordering is more important at the top of a ranked list than at the bottom. Pointwise and pairwise algorithms don’t distinguish where on the ranked list you are.

Consider, for example, a Top 10 recommendation; the pairwise recommendation will penalize you as much for getting the order of the last two items wrong as for getting the first two wrong. You know by now that that isn’t good, because users pay much more attention to the top of the list. To inject this into the algorithm also means that you need to look at the complete list and not each pair of items.

It sounds simple when explained like this: you have to create a ranking such that all items are always ranked correctly. But it turns out that it’s hard to programmatically calculate whether one list is better than another one.

To have a look at a listwise ranking algorithm, I suggest CoFiRank (collaborative filtering for ranking), which was presented at NIPS (Neural Information Processing Systems) in 2007.[5] In the following section, you’ll look at an algorithm that uses pairwise ranking.

5 Weimer et al., “CoFiRank: Maximum Margin Matrix Factorization for Collaborative Ranking.” Available online at http://mng.bz/m0t1.


13.4. Bayesian Personalized Ranking

Once again, it’s always a good idea to be sure you and your coworkers agree on the problem. That’s true for this and so many other things in life. Let’s start with a definition of the problem or task you want to solve. To solve it you’ll use an algorithm called Bayesian Personalized Ranking (BPR), which was presented in a paper by Steffen Rendle et al.[6]

6 Steffen Rendle et al., “BPR: Bayesian Personalized Ranking from Implicit Feedback,” https://arxiv.org/pdf/1205.2618.pdf.

Task to solve

The overall idea is that you want to provide customers with a list of items where the top one is the most relevant one, then the next best one, then the next, and so on. Up to now, I’m pretty sure that we understood each other: for each user you want to order the content items in such a way that the most relevant is on top. This is what you need to describe in a way that both you and the machine understand.

You need to define an ordering that says no matter which two items you hold up, the ordering will rank one better than the other. To make that work, you need three rules: totality, anti-symmetry, and transitivity. For example, for a given user, you want an ordering written such as >u and defined as the following:

  • Totality: For all i,j in I (all items), if i ≠ j, then you have either i >u j or j >u i.
  • Anti-symmetry: For all i,j in I, if i >u j and j >u i, then i = j.
  • Transitivity: For all i,j,k in I, if i >u j and j >u k, then i >u k.

It might not seem like it’s important to spend so much time on this ordering, but it’s needed to make the BPR work.
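An ordering induced by distinct predicted scores satisfies all three rules automatically. A brute-force check on a toy score table (hypothetical data, names mine) makes this concrete:

```python
# Verify totality, anti-symmetry, and transitivity for an ordering
# induced by distinct scores (a toy score table, not real data).
from itertools import permutations, product

scores = {"A": 3.0, "B": 1.5, "C": 2.2, "D": 0.7}
items = list(scores)

def prefers(i, j):  # i >u j
    return scores[i] > scores[j]

# Totality: for i != j, either i >u j or j >u i.
assert all(prefers(i, j) or prefers(j, i)
           for i, j in permutations(items, 2))

# Anti-symmetry: i >u j and j >u i never both hold for i != j.
assert not any(prefers(i, j) and prefers(j, i)
               for i, j in permutations(items, 2))

# Transitivity: i >u j and j >u k implies i >u k.
assert all(prefers(i, k)
           for i, j, k in product(items, repeat=3)
           if prefers(i, j) and prefers(j, k))
```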

If you have implicit data

The algorithm we’re talking about is often only used on implicit feedback. But here the problem is that you never have any negative feedback, because you only have events that say “bought.” Not having negative data makes it hard for a machine-learning algorithm to understand when it’s doing something wrong, so it also doesn’t know when it’s doing something right.

The absence of a bought event could indicate that the user doesn’t know the item exists (they haven’t seen it); they saw it, but didn’t like it; or they saw it, liked it, but haven’t bought it yet. In either case, you can assume that the “not bought” is something worse than the “bought,” as illustrated in figure 13.6. Either a user has bought an item or not (the squared boxes). They then have different conditions for the user-item relationship.

Figure 13.6. Different states of a user-item relationship. You know that when a user buys an item it’s bought, but if a user doesn’t buy an item, what does that mean?

If you have two items, one that’s bought and one that’s not, then when talking about ranking, you can define that a bought item is always more attractive than one that wasn’t bought. With that clear, you can now turn to item pairs, which have two items that could each be either bought or not bought, but you don’t have anything to say about them just yet.

In figure 13.7, you can see the transformation of a user’s buy log into an order matrix. Figure 13.8 shows how this expands your data, because each user will have their own matrix.

Figure 13.7. How to transform one user’s transaction into an order matrix. On the left, ✓ means that the user bought the item. In the matrix on the right, + means the user prefers a row element, and - means the user prefers the column element; for example, A is preferred over RO.

Figure 13.8. The algorithm creates an order matrix for each user. Five users are in the rating matrix and will generate five matrices.

That’s all well and good, but you do have rating data, so how can you use that here? If you want more fine-grained sampling, you should look at MF-BPR (Multi-feedback Bayesian Personalized Ranking). Listen to this talk from RecSys 2016 on YouTube: “Bayesian Personalized Ranking with Multi-Channel User Feedback.”[7]

With explicit data sets

If you have explicit data—ratings—then you could make a similar order by saying non-rated items are below rated items (item rated = bought). You could ask if the non-rated item should be valued as an average rated item or as an item that’s rated below all rated items. In figure 13.8, we’ll assume rated means bought.

The training data set

With the approach described in the previous section, you can now collect a data set to be used to train a ranking recommender. This data set will contain all tuples (u, i, j), where i has been bought and rated by the user, and j has not.
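Collecting those tuples can be sketched in a few lines (toy purchase log; the name `training_triples` is mine, not the book’s):

```python
# Build the training triples (u, i, j): for each user u, i runs over
# items the user bought (here: rated) and j over the rest of the catalog.
buys = {
    "u1": {"A", "B"},
    "u2": {"C"},
}
catalog = {"A", "B", "C", "D"}

def training_triples(buys, catalog):
    for user, bought in buys.items():
        for i in bought:
            for j in catalog - bought:
                yield (user, i, j)

D_S = list(training_triples(buys, catalog))
# u1: 2 bought x 2 not bought = 4 triples; u2: 1 x 3 = 3 triples
```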

13.4.1. Ranking with BPR

Let’s begin with the basics. You want to find a personalized ranking for all items and all users in your data set. For personalized ranking, you’ll use something called Bayesian statistics to solve this problem. Bayesian statistics are based on this simple equation:

  • p(A|B) = p(B|A) × p(A) / p(B)

This equation states that the probability (p) of event A happening given that B happened is equal to the probability that A happens multiplied by the probability that B happens when A occurs, divided by the probability that B happened. To explain, you could say A is the event that it has rained, and B is the event that the street outside is wet. Then Bayes says that the probability that it has rained given that the street is wet (p(A|B)) is equal to the probability that it has rained (p(A)) multiplied by the probability that the street is wet given that it has rained (p(B|A)), divided by the probability that the street is wet (p(B)). This simple equation has turned into an interesting branch of statistics, so I encourage you to look it up.

In the context of the ranking problem, you can formulate it by having an unknown ordering preference for each user, which you denote >u for user u. >u is a total ordering, which, given any two content items i and j from your catalog, says the user will prefer one or the other. You’ll also say that θ is the list of parameters you need to find for the recommender system (or, in fact, any machine-learning prediction model). If you’re talking about the Funk SVD, remember that the problem boils down to making two matrices that you can use to calculate the predictions as shown in the following:

  • r̂ui = Σf uuf × vfi

When you talk about θ in relationship to this, θ is the set of all the uuf’s and vfi’s.

In BPR, you want to find the model, θ, as defined, such that there’s the highest probability that the model will produce a perfect ordering for all users. The probability can be written like this:

  • p(θ | >u)

(You read it by saying: the probability of seeing θ (theta) given the ordering >u.) Due to Bayes’ theorem, you know that if you want to maximize this, then it’s the same as maximizing the following, because they’re proportional:[8]

  • p(>u | θ) × p(θ)

Notice that the ordering >u and the model θ have changed places. Now you’ll work on the probability of seeing the ordering >u given a specific model θ, and multiply that with the probability of seeing that model.


Note

You’ll need to do some math magic in the following section. Feel free to skip it if you’re not interested in the nitty-gritty details.


Yfereo gigon enrj ory mcgia, overweh, frk’a sagr fatloa lte c tlltei jrd lgerno ncu psell gvr tyeclax crwp’a pnnigpahe. Cyk uaemss rbrs jn s tepfrce odrwl rehte’c s ucw er errod ffc tpvq ntoectn items lslfwesayl xtl zkqs lx gtxb esusr, hhwci ja rxd oattl edrniorg >q rurs J xxhx tigominnen. Jl eethr’a qcys nz oiegndrr, eetrh’a szxf c iiyplbtboar srru rhete’c z eedoemrmcrn algorithm usrr asn udcorpe jr.

p(>u | θ) is a question that you’re asking: assuming this ordering exists, what’s the probability that you can find a model that will produce it? Then you mix Bayes into it, and you rephrase the question. Saying p(θ | >u) is the same as asking what is the probability that there’s θ, times the probability that if you have θ, you then have the ordering. Is that clearer now?

13.4.2. Math magic (advanced wizardry)

Let’s look at the two parts of the expression from the previous section. To refresh your memory:

  • p(>u | θ) × p(θ)

Assume the prior is a normal distribution

Let’s start with the last part of the equation: p(θ). Assume that the parameters of the model are independent and that each is normally distributed (p(θ) ~ N(0, Σθ)) with zero mean and a variance-covariance matrix Σθ.[9] Assuming that, you can write the last part as:

  • ln p(θ) = −λθ × ‖θ‖² + constant

This amounts to setting Σθ = λθI.

Likelihood function

Moving on to p(>u | θ), you can do some rewriting. When you say >u, it’s only one user, but you want to maximize it for all users, so it means that you want to maximize

  • p(>u1 | θ) × p(>u2 | θ) × ... × p(>un | θ)

This equation can be written more compactly as Πu∈U p(>u | θ). You have a product of the probability that there’s an ordering for each user, given such a model. The probability of an ordering for one user must be the same as the probability for all pairs of items where one has been bought and the other one not bought—there’s an ordering.

We said previously that you’re only looking at these cases. With some common sense and several clever tricks, you can reduce the previous product to a product containing the probability of an ordering for each data point (u, i, j), meaning all your data Ds becomes

  • Π(u,i,j)∈Ds p(i >u j | θ)

You can take this one step further. Because θ is a recommender system model, you know that (i >u j | θ) means that you’re asking for the probability that a recommender exists that will predict ratings such that (x̂uij > 0). You can rewrite the product again:

  • Π(u,i,j)∈Ds p(x̂uij > 0 | θ)

Relaxing the ordering

Earlier I said that the problem with ranking was binary, in the sense that either item i was preferred or it wasn’t. And because you have the total ordering, this describes a function that’s called the Heaviside function (figure 13.9). The outcome of asking “is i >u j?” can only ever be {yes, no}, given a certain model. That means p(i >u j | θ) is either 0 or 1.

Figure 13.9. Heaviside and sigmoid functions plotted in the interval -6 to 6

You’re only looking at the data where one item was bought by the user and the other one wasn’t. That means there’s no sliding on the function. It’s 0 until there’s a straight vertical line up to 1:

  • H(x) = 0 when x < 0, and H(x) = 1 when x ≥ 0

Remember that in chapter 11 we talked about optimizing using an analogy, comparing it to standing on a foggy hilltop and looking for water, and that it doesn’t work if the function one minute is 1 and the next 0. With a Heaviside function, you can’t see which way to go down safely. To solve this, you can use another function that’s almost the same, one called the sigmoid function. The sigmoid function also runs in the interval from 0 to 1 and moves almost like the Heaviside function. The sigmoid is defined as:

  • σ(x) = 1 / (1 + e^(−x))

Figure 13.9 shows the sigmoid function in action. As you can see from the figure, you can insert the sigmoid without losing too much integrity. You get that

  • p(i >u j | θ) = σ(x̂uij)

where x̂uij is the predicted preference of item i over item j—the difference between the ratings a recommender system predicts for i and for j. You now have the building blocks to put everything together and come up with something you can stuff into some Python code and do ranking.
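A tiny sketch contrasting the two functions (plain Python, no plotting):

```python
import math

# The Heaviside step and its smooth stand-in, the sigmoid.
def heaviside(x):
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# The sigmoid stays close to the step but has a usable gradient everywhere:
for x in (-6, -1, 0, 1, 6):
    print(x, heaviside(x), round(sigmoid(x), 3))
```

At x = ±6 the two functions are nearly indistinguishable; only around 0 does the sigmoid “slide,” which is exactly what makes gradient-based optimization possible.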

Once again, you want to find a set of parameters θ for a model such that you have the highest probability of producing a ranking for all users that’s perfect. You can say you want to maximize the following (you use argmax when you want to say that you want to find the parameters that maximize an expression):

  • argmaxθ Π(u,i,j)∈Ds σ(x̂uij) × p(θ)

You’ll use a trick that says that if you want to do that, then it’s the same as maximizing the following, because the function ln is continuous and always increasing:

  • argmaxθ ln( Π(u,i,j)∈Ds σ(x̂uij) × p(θ) )

Inserting what you deduced, you get

  • argmaxθ Σ(u,i,j)∈Ds ln σ(x̂uij) − λθ × ‖θ‖²

where Ds is all the combinations you have in your data (user u bought/rated an item i but not the item j).

The function you added (ln) is short for the natural logarithm, and it has some nice properties shown here:[10]

  • ln(a × b) = ln(a) + ln(b)
  • ln(a^b) = b × ln(a)

You’ll also set ln p(θ) = −λθ × ‖θ‖² (the normal prior from earlier), which you’ll use to simplify the expression, at least a little bit. You’ll call the part inside of the argmax the BPR optimization criterion (BPR-OPT).

This is the task. Did everyone arrive here safe and sound? Let’s recap.

The expression you arrived at is what the problem boils down to: you want to find the recommender system model—the set of parameters θ—that will make the whole expression as large as possible, meaning that it’s the θ with the highest probability that the system produces an ordering >u that matches all users’ preferences. It’s a noble goal, don’t you think?

I’m afraid to say that this was only the problem; now you also need more work before you get to the algorithm that solves it. Remember stochastic gradient descent from chapter 11? You’ll use something similar here.

You want to find the gradient of the previous expression to understand which way you should move to get closer to the optimal ranking. I claim (without proof) that the gradient of the BPR-OPT is proportional to the following (∝ means proportional to):

  • ∂BPR-OPT/∂θ ∝ Σ(u,i,j)∈Ds e^(−x̂uij) / (1 + e^(−x̂uij)) × ∂x̂uij/∂θ − λθ × θ

And this is the function you’ll use to figure out in which direction you should go to optimize the ranking method. With that, you leave math magic mode and continue as though nothing strange has happened to optimize the expression found here.

13.4.3. The BPR algorithm

In the article where BPR is described, the authors also suggest an algorithm called LearnBPR, which goes as follows:
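Paraphrased from Rendle et al.’s paper, the procedure is a stochastic gradient ascent over sampled triples (the regularization sign below is written out for gradient ascent):

```
procedure LearnBPR(Ds, θ):
    initialize θ
    repeat
        draw (u, i, j) uniformly at random from Ds
        x̂uij ← x̂ui − x̂uj
        θ ← θ + α × ( e^(−x̂uij) / (1 + e^(−x̂uij)) × ∂x̂uij/∂θ − λθ × θ )
    until convergence
    return θ
```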

This is it. I bet it’s a bit like watching a complex whodunit movie and then sleeping through the last 10 minutes (where it was explained why the butler did it). But you’ve arrived at an algorithm that will produce a ranking. Up to now we haven’t said much about the recommender algorithm, and, in fact, the step

in the procedure depends on which recommender you put into it. Most scientific articles use a matrix factorization algorithm, so you’ll do the same. The thing to ponder is, if you use the same algorithm and the same heuristic to solve it as you did in chapter 11, would it then produce the same results?

Perplexing, I know, but now you have a new goal. Remember that in chapter 11 the goal was to reduce the difference between the ratings you have in your data set and what the recommender predicts? Here you don’t care what type of ratings are predicted, only the order of the predictions—the ranking—which allows the learning to be more “free” (for lack of a better word). At this point, it’s also worth mentioning the draw function, which can be implemented in several different ways and with different strategies. In the implementation, I use the simplest one, but there are other ways.

13.4.4. BPR with matrix factorization

If you don’t remember what matrix factorization is, you can refresh your memory in chapter 11. You’ll do many of the same things here. Predicting a rating in matrix factorization comes down to multiplying a row in the user matrix W with a column in the item matrix H, which is done using the following summation

  • x̂ui = Σf=1..K wuf × hfi

where K is the number of hidden factors.

To fit it into the example expression, you need to consider how the gradient will look. You’re taking the gradient in regards to θ, which is the union of all the parameters you’re trying to find, meaning all the w’s and the h’s. With some careful thinking, you’ll see that there are only three cases that are interesting (as in non-zero):

  • ∂x̂uij/∂θ = (hfi − hfj) when θ = wuf
  • ∂x̂uij/∂θ = wuf when θ = hfi
  • ∂x̂uij/∂θ = −wuf when θ = hfj
  • ∂x̂uij/∂θ = 0 otherwise

Rdk zmd nvyk rv saert zr gkr aneitrdg roseisenpx erbfoe eingirzal jrcq. Agr J’m daarfi rzru J vxcy rk leave jr zc nz eexiecrs txl khh xr vp.


13.5. Implementation of BPR

The BPR was first described in section 13.4. The authors, along with several other people, implemented a recommender system algorithm library called MyMediaLite in C#; the code you’ll see in the following is inspired by that.[11] Figure 13.10 shows an overview of what you’re implementing in this section:

11 For more information, see https://github.com/zenogantner/MyMediaLite.

Figure 13.10. An illustration of what you implement in this section. You start by initializing everything, then you train. For each iteration, you run through the same number of ratings, and for each step, you draw a sample of a user and a positive and a negative item. Then you step, meaning that you move all the factors and biases in the right direction.

To run the training, download the code from GitHub (http://mng.bz/04k5) and follow the install instructions in the readme file. Then go to the MovieGEEKs folder and execute the following listing.

Listing 13.1. Running the BPR training algorithm

> python -m builder.bpr_calculator

It outputs something close to this:

2017-11-19 16:23:59,147 : DEBUG : iteration 6 loss 2327.0428779398057
2017-11-19 16:24:01,776 : INFO : saving factors in ./models/bpr/2017-11-19
    16:22:04.441618//model/19/

To use this model, you need to take the folder name where the model (factors) has been saved and insert it into the recommender class. This could be done automatically in a real system, but it’s good to have a manual step so you’re sure you don’t have faulty models suddenly running in production. Insert the path in recs/bpr_recommender.py in line 17, or as the default parameter to the init method, as shown in the following listing.

Listing 13.2. Init method of BPR recs

def __init__(self, save_path='<insert path there>'):       1
    self.save_path = save_path
    self.load_model(save_path)
    self.avg = list(
        Rating.objects.all().aggregate(Avg('rating')).values())[0]
  • 1 Inserts the path. The log only prints out the relative path, but here you need the full one.

Transforming your ratings to data usable to BPR

Before starting on the actual algorithm, you need to transform your rating data into something you can use. The BPR uses implicit feedback, which can either be clicks or purchases. If you consider the user-content lifetime that was described in chapter 4, then you could say that anything the user rated is something that they purchased. You could also say that all ratings are indications that the user bought something. The question then is whether you want to lose the information about whether a user has rated something high.

If you want to take advantage of the user's explicit feedback, you transform all ratings above a certain threshold to indicate a buy and delete the rest. It's a matter of gut feeling. Here the first solution is taken, so you'll have more data to use.
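As a small sketch of the two options, assuming ratings arrive as dictionaries with `user_id`, `movie_id`, and `rating` keys (the field names and the threshold of 4 are illustrative, not taken from the book's code):

```python
# Two ways to turn explicit ratings into the implicit "bought" events
# that BPR expects. Field names and threshold are illustrative.

def all_ratings_as_buys(ratings):
    """Option 1 (used in this chapter): every rating counts as a buy."""
    return [(r['user_id'], r['movie_id']) for r in ratings]

def thresholded_buys(ratings, threshold=4):
    """Option 2: only ratings at or above the threshold count as buys;
    the rest are dropped."""
    return [(r['user_id'], r['movie_id'])
            for r in ratings if r['rating'] >= threshold]

ratings = [{'user_id': 1, 'movie_id': 10, 'rating': 5},
           {'user_id': 1, 'movie_id': 11, 'rating': 2},
           {'user_id': 2, 'movie_id': 10, 'rating': 4}]

print(all_ratings_as_buys(ratings))  # all three ratings become events
print(thresholded_buys(ratings))     # the rating of 2 is dropped
```

Option 1 keeps more training data; option 2 keeps a cleaner signal of what users actually liked.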

LearnBPR method

First, you have the overall build method, which is where you control all of the build. The build method looks like that shown in this listing. You can view the code for the following listings in builder/bpr_calculator.py.

Listing 13.3. The overall build method

def train(self, train_data, k=25, num_iterations=4):

    self.initialize_factors(train_data, k)                       1

    for iteration in range(num_iterations):                      2

        for usr, pos, neg in self.draw(self.ratings.shape[0]):   3
            self.step(usr, pos, neg)                             4
  • 1 Initializes the factors
  • 2 Loops num_iterations (here 4) times
  • 3 Loops through all the samples produced by the draw method
  • 4 Calls the step method

If you were expecting a big reveal, then I guess this is a bit disappointing, so let's move on quickly. The initialize_factors method initializes everything. It doesn't do anything surprising, so I'll leave it to you to look at it if you're interested.[12]

12 For more information, see http://mng.bz/tjAO.

After the initialization, the method loops the number of times indicated in the num_iterations parameter; in each iteration it loops through all the samples of u, i, j, which are the users where item i was bought and item j not. In your case, that means randomly selected. For each of those, you call a step method (next listing).

Listing 13.4. Calling the step method

def step(self, u, i, j):

  lr = self.LearnRate                                                  1
  ur = self.user_regularization                                        1
  br = self.bias_regularization                                        1

  pir = self.positive_item_regularization                              1
  nir = self.negative_item_regularization                              1

  ib = self.item_bias[i]                                               2
  jb = self.item_bias[j]                                               2

  u_dot_i = np.dot(self.user_factors[u, :],
                   self.item_factors[i, :] - self.item_factors[j, :])  3
  x = ib - jb + u_dot_i

  z = 1.0/(1.0 + exp(x))

  ib_update = z - br * ib                                              4
  self.item_bias[i] += lr * ib_update                                  4

  jb_update = - z - br * jb                                            4
  self.item_bias[j] += lr * jb_update                                  4

  update_u = ((self.item_factors[i,:] - self.item_factors[j,:]) * z
               - ur * self.user_factors[u,:])                          5
  self.user_factors[u,:] += lr * update_u                              5

  update_i = (self.user_factors[u,:] * z                               6
              - pir * self.item_factors[i,:])                          6
  self.item_factors[i,:] += lr * update_i                              6

  update_j = (-self.user_factors[u,:] * z                              6
               - nir * self.item_factors[j,:])                         6
  self.item_factors[j,:] += lr * update_j                              6
  • 1 Creates short nicknames for the learning rate and regularization constants
  • 2 Does the same with the item biases
  • 3 Takes the dot product between the user factor and the difference between the two item vectors
  • 4 Updates the item biases
  • 5 Updates the user's factor vector
  • 6 Updates the item factors

The step method does exactly the same as the one you implemented for the matrix factorization in chapter 11. I encourage you to read through it again for the details (see the code in bpr_calculator.py and chapter 11). What's more interesting in this chapter is how the sampling is done and how the prediction and loss functions look.
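For reference, the updates in listing 13.4 are one step of stochastic gradient ascent on the BPR criterion. Writing the user factors as $p_u$, the item factors as $q_i$ and $q_j$, and the item biases as $b_i$ and $b_j$ (shorthand symbols of mine, not names from the code), each step computes

```latex
\hat{x}_{uij} = b_i - b_j + p_u \cdot (q_i - q_j), \qquad
z = \frac{1}{1 + e^{\hat{x}_{uij}}}
```

and then moves every parameter $\theta$ with learning rate $\eta$ and its regularization constant $\lambda_\theta$:

```latex
\theta \leftarrow \theta + \eta \left( z \,
    \frac{\partial \hat{x}_{uij}}{\partial \theta}
    - \lambda_\theta\, \theta \right),
\quad \text{where}\quad
\frac{\partial \hat{x}_{uij}}{\partial p_u} = q_i - q_j,\;
\frac{\partial \hat{x}_{uij}}{\partial q_i} = p_u,\;
\frac{\partial \hat{x}_{uij}}{\partial q_j} = -p_u,\;
\frac{\partial \hat{x}_{uij}}{\partial b_i} = 1,\;
\frac{\partial \hat{x}_{uij}}{\partial b_j} = -1.
```

Plugging the partial derivatives into the update rule gives exactly update_u, update_i, update_j, and the two bias updates in the listing.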

Draw method

A draw, or a sample, consists of a user ID and two item IDs, where one item is preferred by the user over the other. This can be implemented by saying that the preferred item is the one that the user purchased, and the other, the one that the user hasn't bought (or in our case, one is rated and one isn't). To draw a sample like that with your rating data, you can do the following:

  • Draw a random user rating to get the user ID and the positive item.
  • Keep drawing random ratings until you have an item that isn't rated by the user.

This leaves assumptions about your ratings that you should remember. This data set, MovieTweetings, only contains content if somebody has rated it, so all the content turns up in the ratings data. Also, popular items appear more frequently than others because they're rated more.

The draw method in listing 13.5 uses yield instead of return, so when it arrives at yield it delivers the result. But then it stays in the for loop so that draw will iterate through the entire index. You could do this by pushing all the samples to a list and then returning the list, but yield seems a nicer way of doing it. Note that the scripts for the following listings can be found in builder/bpr_calculator.py.

Listing 13.5. The draw method

def draw(self, no=-1):
    if no == -1:
        no = self.ratings.nnz
    r_size = self.ratings.shape[0] - 1                           1
    size = min(no, r_size)                                       1
    index_randomized = random.sample(range(0, r_size), size)     1
    for i in index_randomized:                                   2
        r = self.ratings[i]
        u = r[0]
        pos = r[1]

        user_items = self.ratings[self.ratings[:, 0] == u]       3
        neg = pos                                                4
        while neg in user_items:                                 5
            i2 = random.randint(0, r_size)
            r2 = self.ratings[i2]
            neg = r2[1]

        yield self.u_inx[u], self.i_inx[pos], self.i_inx[neg]
  • 1 Ceaceus hpx wnsr vr ffsuleh vtpp data, eatcsre nc yaarr le fsf rgx mubesrn jn vrb idnex (0 lntiu xrd yno) unz yxrn ssfehufl.
  • 2 Xncg otgruhh rku flsuhdfe idexn
  • 3 B taring ja ctlsedee, wnv lqnj ffc xry essru ratings. (Hxot’c z pclae eehrw egd cdluo aoblprby ipmetoiz grx veqz zk dye ben’r xuzk kr irftle cff vur ratings evrey rmvj.)
  • 4 See bvr yills ktirc xr bvr gurthoh xrp fexh ionsttacrn kgr itfsr ojmr.
  • 5 Pzxhk itunl negative jz cn rmjo rdsr ukr tvqa zndc’r daetr.

The loss function (create_loss_samples in listing 13.6) indicates whether you're going in the right direction. It runs through the loss samples that were created in the initialization.

Listing 13.6. The loss function

builder/bpr_calculator.py
def create_loss_samples(self):
    num_loss_samples = int(100 * len(self.user_ids) ** 0.5)         1
    self.loss_samples = [t for t in self.draw(num_loss_samples)]    2
  • 1 Number of samples that should be taken
  • 2 Draws the number of samples

The loss function shown in listing 13.7 runs through this loss sample and calculates the loss:

Listing 13.7. Calculating the error on loss sample data

def loss(self):
    br = self.bias_regularization                                    1
    ur = self.user_regularization                                    1
    pir = self.positive_item_regularization                          1
    nir = self.negative_item_regularization                          1

    ranking_loss = 0
    for u, i, j in self.loss_samples:                                2
        x = self.predict(u, i) - self.predict(u, j)                  2
        ranking_loss += 1.0 / (1.0 + exp(x))                         2

    c = 0
    for u, i, j in self.loss_samples:                                3
        c += ur * np.dot(self.user_factors[u], self.user_factors[u])
        c += pir * np.dot(self.item_factors[i], self.item_factors[i])
        c += nir * np.dot(self.item_factors[j], self.item_factors[j])
        c += br * self.item_bias[i] ** 2
        c += br * self.item_bias[j] ** 2

    return ranking_loss + 0.5 * c                                    4
  • 1 Creates short nicknames for the constants
  • 2 Calculates the ranking loss
  • 3 Regularization expressions
  • 4 Ranking loss plus half the regularization

The loss function uses a prediction method, but the difference is that values, not ratings, are predicted. Listing 13.8 illustrates how the value is compared to the prediction of another item, which shows how those two should be ranked against each other.

Listing 13.8. Ranking one item against another

def predict(self, user, item):
    i_fac = self.item_factors[item]
    u_fac = self.user_factors[user]

    pq = i_fac.dot(u_fac)                      1

    return pq + self.item_bias[item]           2
  • 1 Takes the dot product between the item factors and the user factors
  • 2 Adds the item bias and returns

Running this algorithm takes a long time, but there are many places where you could be smarter and optimize it to run several hundred times faster with a few tricks. As it stands here, it takes my MacBook 2017 model around two hours per iteration. At that rate, 20 iterations will take 40 hours, so you can go for a run or something. When it's finished, however, you can use it to make recommendations. Let's look at how you do that next.
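One such trick, sketched below under my own assumptions (not taken from the book's code): precompute each user's rated items once, instead of filtering the whole ratings array for every sample the way the draw method in listing 13.5 does. Set membership tests are O(1), so sampling becomes much cheaper.

```python
import random
from collections import defaultdict

def build_user_items(ratings):
    """ratings: list of (user, item) pairs; returns user -> set of items."""
    user_items = defaultdict(set)
    for user, item in ratings:
        user_items[user].add(item)
    return user_items

def fast_draw(ratings, all_items, user_items, n):
    """Yields n (user, positive, negative) samples using set lookups
    instead of filtering the ratings array on every draw."""
    for user, pos in random.sample(ratings, min(n, len(ratings))):
        neg = pos
        while neg in user_items[user]:      # O(1) membership test
            neg = random.choice(all_items)
        yield user, pos, neg

ratings = [(1, 'a'), (1, 'b'), (2, 'b'), (2, 'c')]
all_items = ['a', 'b', 'c']
user_items = build_user_items(ratings)
samples = list(fast_draw(ratings, all_items, user_items, 3))
```

Other easy wins along the same lines are vectorizing the step updates over mini-batches and moving the factor matrices to contiguous NumPy arrays up front.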

13.5.1. Doing the recommendations

Be suun raxr vrd recommendations, uhk snz statr MovieGEEKs nzp, jl gpx drtniae rkb mleod, rj ffjw pcdeuor recommendations mltv rdv CZA ngsui vyr hetodm sohnw nj obr rovn liitgsn. Aey’ff lnpj xrg khsx lvt yxr tnislsig nj rjbc tneocsi jn / recs berormcm_dneerp/.uh.

Listing 13.9. Top N recommendation method using the BPR model

def recommend_items_by_ratings(self, user_id, active_user_items, num=6):

    rated_movies = {movie['movie_id']: movie['rating']
                    for movie in active_user_items}                  1
    recs = {}
    if str(user_id) in self.user_factors.columns:                    2

        user = self.user_factors[str(user_id)]
        scores = self.item_factors.T.dot(user)                       3

        sorted_scores = scores.sort_values(ascending=False)          4
        result = sorted_scores[:num + len(rated_movies)]             5

        recs = {r[0]: {'prediction': r[1] + self.item_bias[r[0]]}
                for r in zip(result.index, result)
                if r[0] not in rated_movies}                         6

    s_i = sorted(recs.items(),
                 key=lambda item: -float(item[1]['prediction']))     7

    return s_i[:num]
  • 1 Makes a dictionary of the active user's movies, which comes in handy when verifying that you're not recommending anything the user has already seen.
  • 2 To use the model, it must have seen the user; otherwise, you can't return any recommendations.
  • 3 Calculates the dot product between the active user's factor and all the item factor vectors so that you can calculate which items are most likely to be preferred
  • 4 Orders values descending
  • 5 Cuts the list down to the number that should be returned plus the number of ratings the user has
  • 6 Runs through the resulting items, adding the item bias
  • 7 Orders again and returns the expected number

To do this, you first have to load the model using the code in listing 13.10. The model is saved in the last step of the training.

Listing 13.10. Loading the model

def load_model(self, save_path):

   with open(save_path + 'item_bias.data', 'rb') as ub_file:
       self.item_bias = pickle.load(ub_file)
   with open(save_path + 'user_factors.json', 'r') as infile:
       self.user_factors = pd.DataFrame(json.load(infile)).T
   with open(save_path + 'item_factors.json', 'r') as infile:
       self.item_factors = pd.DataFrame(json.load(infile)).T

13.6. Evaluation

Hxw ky kdu xcrr vrd algorithm? Dnx wbs cj pwjr pvr offline evaluation gzrr vqp rleadne tbuao jn chapter 9, hrwee dqk kypa cross-validation. Ckd evaluation cj oshnw nj figure 13.11.

Figure 13.11. The evaluation runner for an algorithm. It’s a pipeline where the data is first cleaned, then split into k folds for cross-validation. For each fold, it repeats the training of the algorithm, then evaluates it. When it’s finished, you aggregate the result.

The following listing shows the evaluation method I added. This creates the data that's shown in the graph in figure 13.12. You can view the code for this listing in /evaluator/evaluation_runner.py.

Figure 13.12. Mean average precision for BPR on a short top N. It isn’t that impressive, but I’m sure it can be tweaked to make it better.

Listing 13.11. The evaluation method

def evaluate_bpr_recommender():
   timestr = time.strftime("%Y%m%d-%H%M%S")
   file_name = '{}-bpr-k.csv'.format(timestr)

   with open(file_name, 'a', 1) as logfile:
       logfile.write("rak,pak,mae,k,user_coverage,movie_coverage\n")

       for k in np.arange(10, 100, 10):
           recommender = BPRRecs()
           er = EvaluationRunner(0,
                                 None,
                                 recommender,
                                 k,
                                 params={'k': 10,
                                         'num_iterations': 20})
           result = er.calculate(1, 5)

           user_coverage, movie_coverage = \
               RecommenderCoverage(recommender).calculate_coverage()
           pak = result['pak']
           mae = result['mae']
           rak = result['rak']
           logfile.write("{},{},{},{},{},{}\n".format(
               rak, pak, mae, k, user_coverage, movie_coverage))

Measuring the precision gave the result shown in figure 13.12, which is nothing special. I've confidence that it can be better: it's okay, but on a small N it isn't great. I also calculated the coverage, and it could also be better: item coverage is 6.4% and user coverage is 99.9%.

Jl gue vxvf rz our eralgr nubmser, vunr jr lusddeyn olsok sgmd ettber sc hnwos nj figure 13.13. Coq nthgi uotba rpv paghr nj figure 13.13 gns eetsh rebmnus cj rrzu bgrv’vt ollca kr cjrg data ora, nuc bkh anz’r eayrll yoc xrbm ktl amdg rhote rcun az c erhbknmca rbcr szn ky pdmoiver enqy. Xjcd ernodeemrmc enhf dcnesmemor 6% el rqx items, wichh njz’r mgab. Rbx oslhud aborlbpy kmj jr wyjr z cnoetnt-aedsb edmmrnoecre te gkc otmnesihg kvfc ojvf our litum-raemd bdtani chseem xr uonedicrt mtvx items knrj yrx tsyems.

Figure 13.13. Precision and recall for the BPR algorithm


13.7. Levers to fiddle with for BPR

BPR is a complex algorithm, and there are many decisions that were made before running it. Unfortunately, I glossed over many of them, such as how many factors should be included, and what learning rate enables the system to optimize the learning problem best? That doesn't mean they aren't important here, only that it's up to you to use what you learned in former chapters to evaluate the best parameters. Let's take a quick walk through them.

You have the item bias and the user factors, so you need to decide how many factors you want to use. The number of factors should be determined by how complex your data domain is. For example, movies are divided into small sets of types (or genres), so it could be they don't need too many factors. But if you're making a wine recommender such as the Vivino.com example, then you probably need many more. This is something you should test based on your data set.

It's hard to give good advice about the learning rate. I found that a too-high learning rate gave me large user factor values, meaning that the biases didn't have much to say, while a too-low learning rate allowed the biases to take over the decisions. Regularization tries to keep things at bay, but if the regularization constants need to be too large, then maybe it's a sign that the learning rate is too high.
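In practice, that means picking the learning rate empirically. A minimal sketch of such a search, assuming a hypothetical `train_and_loss` callback that runs the trainer from listing 13.3 on a small sample for a fixed number of iterations and returns the final value of the loss method:

```python
def pick_learning_rate(train_and_loss, candidates=(0.1, 0.05, 0.01, 0.005)):
    """Train once per candidate learning rate and keep the one with
    the smallest resulting loss."""
    results = {lr: train_and_loss(lr) for lr in candidates}
    return min(results, key=results.get), results

# Toy stand-in for the real trainer, whose loss happens to be
# smallest around 0.05:
best, results = pick_learning_rate(lambda lr: abs(lr - 0.05) + 1.0)
print(best)  # → 0.05
```

The same loop works for the factor count and the regularization constants; just watch the loss samples from listing 13.6 rather than training loss alone, so you don't tune yourself into overfitting.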

I love the idea of comparing items and then pushing them in different directions (I'm talking about the item bias here). This can be adjusted with the positive and negative item regularization. These also affect how much a negative item is pushed away, and that might hurt new items if there aren't many users who have consumed those yet.

The data set that you used contained ratings, so you could have removed the low-rated items so they didn't figure as positive items in your training set. But even if a user didn't like a movie, it was still something they'd consumed, so it's good to use those also.

As we conclude this chapter, did you learn everything you needed? If not, then you can reread this chapter again. But it's hard, and unless you're going to implement the BPR, maybe you don't need to remember all the details. But if you're planning to implement a recommender using BPR, then it's a good idea to know how it works. Learning to rank is also something you'll have to consider when you work with search machines, so it's not a bad thing to know a little bit about ranking.

This is it. One more chapter to go, then you can go to GoodReads.com and mark this book as read. Remember to review it where you bought it, good or bad. Recommender systems need feedback, so please support that.

Summary

  • Learning to Rank (LTR) algorithms solve a different problem than that of the classic recommender problem of predicting ratings.
  • Foursquare uses a ranking algorithm to combine the ratings and the locations of venues, which is a great example of how to combine relevant data into one recommendation.
  • A challenge with the LTR algorithm is that it isn’t easy to come up with a continuous function that can be optimized.
  • This chapter touched upon Bayes' theorem, so even if it wasn’t the subject of this book, you should look into it because it’s used in many scenarios. I recommend Practical Probabilistic Programming by Avi Pfeffer (Manning, 2016) for additional information.
  • The Bayesian Personalized Ranking (BPR) can be used on top of the matrix factorization method you looked at in chapter 10, but also with other types of algorithms.