5 Creating progress bars and time-outs in Python

MEAP v5

This chapter covers

  • How to create a progress bar to monitor code execution (such as time-consuming loops)
  • How to monitor the progress of training machine learning (ML) models
  • How to create a timer to auto-stop long-running code

The purpose of this chapter is to serve as a key foundation for the next several chapters on scaling. We defined scaling back in Chapter 1, and will visit it in more detail next chapter. For now, consider that one of the key components within scaling is to deal with larger amounts of data from an efficiency (less time) perspective. If you’ve been working in data science for some time, you’re well aware some types of code execution (such as model training or data extraction) can take a long time. For example, maybe you’re training a machine learning model on a dataset. How long do you have to wait? A few seconds? Minutes? Hours? Without a progress bar, it may be difficult to know. Or what if you’re scraping weather data from a collection of hundreds or thousands of locations? How would you know if your code became stuck or was still progressing? Having a way to monitor the progress of the running code, or stopping long-running code automatically once a time limit is reached, can be very useful in these situations. For example, you might want to use a progress bar to track as you loop through this collection of weather-related webpages to extract data. Or you might want to know how far along the training has come when training an ML model.

Let’s discuss how these topics fit into the bigger picture of this book.

  • Progress bars are invaluable tools for letting you know how far along a piece of code has executed. This can be helpful if you’re not sure how long code will run, so that you’re not in the dark while it is executing. Python has several packages available to create progress bars, including tqdm and progressbar2. We’ll be delving into the tqdm library shortly.
  • You’re training a machine learning model (like predicting customer churn), and you’re uncertain how long it will take given your dataset. Having a progress monitor in this situation can help you know whether you should expect to wait only seconds, hours, or somewhere in between.
  • You may have code running for a long period of time, and you want to add a time-out in case it runs past a certain point. This could be useful in a variety of situations, such as avoiding having code run for too long in a production setting, or simply if you’re testing out training a model and you want to check whether it will complete in under a specified time frame. For this, you can use a handy package called stopit, which we’ll explain in more detail later in this chapter.

These issues correspond to the diagram in Figure 5.1. These items have something in common - they all deal with an efficiency-related issue. Being able to better monitor the progress of your code will be valuable once we are working through examples of improving the runtime of your code (see the next chapter, for instance).

Figure 5.1 This diagram shows three common problems you may encounter in your data science work, and the Python packages we can use to solve them. In the next few sections, we’ll explain what these packages do, and how to use them to monitor code progress and to auto-stop long-running code.

In the next few sections, we’ll cover the issues and proposed solutions in Figure 5.1. First, let’s start with the tqdm package for creating progress bars in Python.

5.1 Creating progress bars and timing Python processes

The tqdm package is a popular Python library for creating progress bars on executing code. tqdm can be useful when you’re interested in monitoring how far along you are when running code. For example, suppose you build a tool to download data from a collection of webpages. Depending on the number of webpages, this process might take a while. tqdm will tell you exactly how many iterations you’ve completed (and how many are left). Without further ado, let’s get started!

You can install tqdm, like the other libraries mentioned in this chapter, using pip:

Listing 5.1 Install tqdm using pip
pip install tqdm #1

The main way tqdm is used involves iterations, like for loops. After running from tqdm import tqdm, we just need to wrap the keyword tqdm around the iterable that we are looping over. Let’s extend an example from Chapter 3 where we downloaded stock price data. This time, though, we’ll use tqdm to monitor the progress of the download.
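Before the stock example, here is a minimal, self-contained sketch of the same idea - wrapping tqdm around any iterable. (The toy loop and the optional desc label are our own illustration, not part of the book’s example.)

```python
from tqdm import tqdm

total = 0
# tqdm wraps the iterable and prints a live progress bar (to stderr by default)
for i in tqdm(range(100), desc="summing"):
    total += i
```

The loop body is unchanged; only the iterable is wrapped.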

Listing 5.2 Use tqdm to monitor the progress of downloading data from a sequence of webpages
from tqdm import tqdm #1
from yahoo_fin import stock_info as si #1


tickers = ["amzn", "meta", "goog", #2
    "not_a_real_ticker", #2
    "msft", "aapl", "nflx"]  #2

all_data = {} #3
failures = [] #4


for ticker in tqdm(tickers): #5
    try: #6
        all_data[ticker] = si.get_data(ticker) #6

    except Exception: #7
        failures.append(ticker) #7
        print(ticker) #7

If you execute Listing 5.2 in a terminal, or in some other environment (like a notebook or IDE), you should see a progress bar being printed out, like the following snapshot, which shows the total number of iterations we covered and the amount of time it took to handle going through them (three seconds in this case).

Figure 5.2 Example of a complete progress bar being printed using tqdm

Part way through the code execution, you’ll see output like Figure 5.3, which shows how far along we’ve come - in this case, the progress bar tells us we’ve covered the first 4 iterations out of 7 in total.

Figure 5.3 Example of a partially complete progress bar being printed using tqdm

tqdm can also be used in list comprehensions or dictionary comprehensions, as well. Recall that a list comprehension is a compact way of executing a for loop inside of a list (and likewise, a dictionary comprehension is a compact way of using a for loop to create a new dict). The result is a new list (or dict in the case of a dictionary comprehension).
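As a quick illustration of this point (using toy data of our own, not the churn dataset), tqdm simply wraps the iterable inside either kind of comprehension:

```python
from tqdm import tqdm

words = ["precision", "recall", "accuracy"]

# list comprehension with a progress bar
lengths = [len(w) for w in tqdm(words)]

# dictionary comprehension with a progress bar
length_map = {w: len(w) for w in tqdm(words)}
```
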

For example, we could run the code broken out in Listings 5.3 and 5.4 using a list comprehension to get the evaluation metrics for each model. Listing 5.3 shows a function that we use to calculate common classification metrics, like precision and recall.

Listing 5.3 This code listing creates a function called get_metrics that we can use to extract metrics like precision or recall for a model. Listing 5.4 uses this function to get metrics on several different customer churn models, including logistic regression and boosting.
from sklearn import metrics #1
from sklearn.ensemble import ( #1
     RandomForestClassifier, #1
     GradientBoostingClassifier) #1

from sklearn.linear_model import LogisticRegression #1
from tqdm import tqdm #1
from ch4.dataset_class_final import DataSet #1

def get_metrics(model, #2
    train_features, #2
    test_features, #2
    train_labels, #2
    test_labels): #2

    train_pred = model.\ #2
        predict(train_features) #2
    test_pred = model.\ #2
        predict(test_features) #2

    train_precision = metrics.\ #2
        precision_score(train_labels, #2
        train_pred) #2
    train_recall = metrics.\ #2
        recall_score(train_labels, #2
        train_pred) #2
    train_accuracy = metrics.\ #2
       accuracy_score(train_labels, #2
       train_pred) #2

    test_precision = metrics.\ #2
        precision_score(test_labels, #2
        test_pred) #2

    test_recall = metrics.\ #2
        recall_score( #2
        test_labels, #2
        test_pred) #2

    test_accuracy = metrics.\ #2
        accuracy_score( #2
        test_labels, #2
        test_pred) #2

    return (train_precision, #2
        train_recall, #2
        train_accuracy, #2
        test_precision, #2
        test_recall, #2
        test_accuracy) #2

The list comprehension in Listing 5.4 creates the list of evaluation metrics by using a for loop over each model. Much of the code in Listing 5.4 is similar to what we saw in Chapter 3 examples, except now we are modifying the code to use a list comprehension and tqdm.

Listing 5.4 Here, we use tqdm to monitor the progress of getting evaluation metrics for each model. This makes use of tqdm’s ability to monitor the progress of list comprehensions. The code for this example is available in the tqdm_with_list_comprehension_file.py file in the ch5 directory. Make sure to run this code from the parent directory (as you can see based on the code snippets in this listing).
customer_obj = DataSet(feature_list = [ #1
    "total_day_minutes", "total_day_calls", #1
    "number_customer_service_calls"], #1
    file_name = "data/customer_churn_data.csv", #1
    label_col = "churn", #1
    pos_category = "yes" #1
    ) #1

forest_model = RandomForestClassifier( #2
    random_state = 0).\ #2
    fit(customer_obj.\ #2
    train_features, #2
    customer_obj.train_labels) #2

logit_model = LogisticRegression().\ #3
    fit(customer_obj.\ #3
    train_features, #3
    customer_obj.\ #3
    train_labels) #3

boosting_model = GradientBoostingClassifier().\ #4
    fit(customer_obj.\ #4
    train_features, #4
    customer_obj.\ #4
    train_labels) #4

model_list = [forest_model, #5
    logit_model, #5
    boosting_model] #5

all_metrics = [get_metrics(model, #6
    customer_obj.train_features, #6
    customer_obj.test_features, #6
    customer_obj.train_labels, customer_obj.test_labels) #6
    for model in tqdm(model_list)] #6

The result of running the code in Listings 5.3 and 5.4 will generate a progress bar like that in Figure 5.4:

Figure 5.4 Example of a tqdm progress bar from executing a list comprehension.

That covers tqdm. The tqdm library can also be extended or combined with other packages for additional use cases, such as monitoring the progress of jobs running in parallel or across multiple machines. For example, tqdm can be combined with another Python library called joblib to monitor parallel tasks (see this link for reference documentation: https://joblib.readthedocs.io/en/stable/). In summary, tqdm allows us to easily create progress bars to monitor how long it is taking for our code to execute. Next, let’s discuss how to monitor the progress of ML model training.
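Picking up the joblib mention above, here is a minimal sketch of one common way to combine the two (our own toy task, not an example from the book). Note the bar tracks how quickly tasks are dispatched to the workers, which may run slightly ahead of task completion:

```python
from joblib import Parallel, delayed
from tqdm import tqdm

def square(x):
    return x * x

# wrap the input sequence in tqdm; joblib consumes it as tasks are dispatched
results = Parallel(n_jobs=2)(delayed(square)(i) for i in tqdm(range(8)))
```
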


5.2 Monitoring the progress of training ML models

While tqdm is great for monitoring the progress of iterating over loops, it’s also a common desire to monitor the progress of an ML model being trained. Luckily, several types of models available in Python’s scikit-learn library allow for this. Let’s walk through an example using a gradient boosting model (GBM). We’ll use the customer churn dataset we’re already familiar with from previous examples.

Listing 5.5 The code here trains a GBM model on the customer churn dataset.
from sklearn.ensemble import GradientBoostingClassifier #1
from ch4.dataset_class_final import DataSet #1

customer_obj = DataSet( #2
    feature_list = ["total_day_minutes", #2
    "total_day_calls", #2
    "number_customer_service_calls"], #2
    file_name = "data/customer_churn_data.csv", #2
    label_col = "churn", #2
    pos_category = "yes" #2
    ) #2

gbm_model = GradientBoostingClassifier(learning_rate = 0.1, #3
    n_estimators = 300, #3
    subsample = 0.7, #3
    min_samples_split = 40, #3
    max_depth = 3) #3

gbm_model.fit(customer_obj.\ #4
    train_features, #4
    customer_obj.\ #4
    train_labels) #4

The issue with the code we have in Listing 5.5 is that the model will train without giving us any progress update while training. Fortunately, there’s a simple fix to this. All we need to do is to add an extra parameter called verbose to the GradientBoostingClassifier object.

Listing 5.6 This code is almost exactly the same as Listing 5.5, except now we add the verbose = 1 argument to the GradientBoostingClassifier object. This will print out a progress log as the model trains.
gbm_model = GradientBoostingClassifier(learning_rate = 0.1, #1
    n_estimators = 300, #1
    subsample = 0.7, #1
    min_samples_split = 40, #1
    max_depth = 3, #1
    verbose = 1) #1

gbm_model.fit(customer_obj.\ #2
    train_features, #2
    customer_obj.train_labels) #2

Now, if we run Listing 5.6, a progress log will be generated, like in Figure 5.5. To run the full code for yourself, check out the training script available in the ch5 directory.

Figure 5.5 Sample output from training the GBM model using the verbose argument set equal to 1

This log will print out as the model is training. In this case, since we are using GBM, the log prints out as iterations in the model complete (as each tree finishes training).

The verbose option is available for several other model types, as well, including random forests, neural networks, and k-means clustering. To know if a scikit-learn model supports the verbose argument for progress monitoring, you can check out that model’s documentation on scikit-learn’s website. For example, if you wanted to monitor the progress of a random forest model, you could follow the code example in Listing 5.7.

Listing 5.7 Use the verbose option to monitor the progress of a random forest model in sklearn
from sklearn.ensemble import RandomForestClassifier #1

forest_model = RandomForestClassifier(verbose = 1, #2
    random_state = 0, #2
    n_estimators = 500) #2

forest_model.fit(customer_obj.\ #3
    train_features, #3
    customer_obj.train_labels) #3

Running the code in Listing 5.7 will generate the output in Figure 5.6:

Figure 5.6 Sample output from training the random forest model using the verbose argument

The last few examples cover monitoring the progress of training a model. However, what about hyperparameter tuning? Hyperparameter tuning is a common process in machine learning. In the next section, we’ll cover how to monitor the progress of hyperparameter tuning.

5.2.1 Monitoring hyperparameter tuning

A related task in machine learning where having a progress monitor is useful involves hyperparameter tuning. Recall that tuning hyperparameters is analogous to tuning a radio to have optimal signal. Hyperparameter tuning, such as selecting the number of iterations or max depth of a tree in GBM, is used to optimize the hyperparameters for optimal model performance and can be computationally expensive. This process can also take a long time, depending on the size of your dataset and scope of the tuning involved. Python’s scikit-learn library also offers the ability to monitor the time progress of hyperparameter tuning, so you can get status updates of how many iterations the tuning process has been through. Extending on our earlier example, let’s use the same dataset for tuning our GBM model.

First, we will start by importing RandomizedSearchCV, which will randomly select combinations of parameters from the grid we create and use them to train the model. For large datasets and search grids, randomized search is much faster than a standard grid search. For our purposes here, you could potentially use either a randomized search or standard grid search (you can create a progress bar for either). In our RandomizedSearchCV object, we specify several components:

  • The model we want to use for hyperparameter tuning (in this case, the GradientBoostingClassifier)
  • The search grid, parameters
  • n_jobs = the number of cores on our computer we want to use in parallel while running the search
  • The metric we use to determine the optimal model (in this case, ROC-AUC)
  • The number of iterations - or combinations of parameters we want to try out (200 in our example)
  • The random_state, which is a typical parameter for setting the random seed
  • The verbose argument (set equal to 1 in this example)

Again, setting verbose = 1 in our example will print out the progress of the hyperparameter tuning.

Listing 5.8 Use the verbose argument to monitor the progress of the hyperparameter tuning process.
from sklearn.model_selection import RandomizedSearchCV #1
from sklearn.ensemble import GradientBoostingClassifier #1
from ch4.dataset_class_final import DataSet #1

customer_obj = DataSet( #2
    feature_list = ["total_day_minutes", #2
    "total_day_calls", #2
    "number_customer_service_calls"], #2
    file_name = "data/customer_churn_data.csv", #2
    label_col = "churn", #2
    pos_category = "yes" #2
    ) #2

parameters = {"max_depth":range(2, 8), #3
    "min_samples_leaf": range(5, 55, 5), #3
    "min_samples_split": range(10, 110, 5), #3
    "max_features": [2, 3], #3
    "n_estimators": [100, 150, 200, #3
        250, 300, 350, 400]} #3

clf = RandomizedSearchCV( #4
    GradientBoostingClassifier(), #4
    parameters, #4
    n_jobs=4, #4
    scoring = "roc_auc", #4
    n_iter = 200, #4
    random_state = 0, #4
    verbose = 1) #4

clf.fit(customer_obj.train_features, #5
    customer_obj.train_labels) #5

Running the code in Listing 5.8 with the verbose parameter will generate the output in Figure 5.7:

Figure 5.7 This snapshot shows the output of using the verbose parameter when conducting hyperparameter tuning. Notice how it culminates with displaying the total amount of time taken to run the hyperparameter search.

If we remove the verbose argument, nothing will be printed.

Thus far, we’ve covered:

  • How to create progress bars in loops
  • How to monitor model training progress
  • How to monitor the progress of hyperparameter tuning

Next, let’s talk about how to automatically stop code execution once it has been running beyond a timeout period.


5.3 How to auto-stop long-running code

What if you have code which may run for a long time, but you want to stop it in case it goes past a certain duration? This is where the stopit package comes in handy. The stopit library allows you to specify a time-out limit for code execution. This can be useful in many cases, such as test running your code when you’re not sure whether it will take a long time (where long is defined by your particular use case). For example, in a production setting, you might need to fetch a model prediction in a low-latency amount of time, so you could set a timeout on how long it takes. Or, you might want to set a limit on how long it takes to train a given model in order to better spend your time on another method or make changes to the current training process.

The stopit library can be applied in many data science-related settings, including:

  • Training a machine learning model where you want to avoid consuming compute resources for too long of a time
  • Reading data over a network connection, such as extracting data from a database
  • Setting timeout limits on an iterative loop where you could be scraping data from the web, but want to limit the amount of time your code takes to handle the task

Now, let’s walk through a few code examples using stopit. To start using stopit, you first need to install it using pip.

Listing 5.9 Run pip install stopit to set up the stopit package
pip install stopit #1

Once you’ve installed stopit, you’re ready to start using it for your tasks! Let’s go through a few examples.

5.3.1 Set a time-out on a loop

A common example where running code might take longer than desired is when looping over a collection of values or performing a series of tasks over a loop. Let’s start with a simple example of iterating over the integers between 1 and 100 million. To set a timeout limit, we will use the ThreadingTimeout method available in stopit. The only parameter we need to give this method is the number of seconds that we want to set as the timeout period. In this first case, we’ll use 5 seconds as the timeout. The code that we want to execute within this timeout limit will be inside of the with block. Once you execute the code, any code within the with block will start executing, but will stop once the timeout limit of 5 seconds has been reached. If you want to set a higher limit, you just need to change 5 to another value, like 30 seconds, 60 seconds, etc.

The context_manager variable we created in Listing 5.10 stores the state of the code execution. We can check whether context_manager.state == context_manager.EXECUTED. If this condition holds, then our desired code block finished executing before the timeout limit. Otherwise, the code did not finish before the timeout was reached. In the second case, we could print a message to the user stating that the code did not finish, or potentially raise an error message saying the same.

Listing 5.10 Use stopit to limit how much time a for loop takes to execute. This code is available in the ch5/stopit_for_loop.py file.
import stopit #1

with stopit.ThreadingTimeout(5) as context_manager: #2

    # sample code we want to run... #2
    for i in range(10**8): #2
        i = i * 2 #2


if context_manager.state == context_manager.EXECUTED: #3
    print("COMPLETE...") #3


elif context_manager.state == context_manager.TIMED_OUT: #4

    print("DID NOT FINISH...") #4

    # or raise an error if desired #4
    raise AssertionError("DID NOT FINISH") #4

Xxb sastleein wowlrofk qwxn uigsn otpits jz miliars orscsa pmnc stcaappnoiil. Xz wx knwr rhtuhgo jn xpr fstri leeampx, bkr aeglren wlroofkw xkqc fxjo jarq:

  1. Create a with block using the ThreadingTimeout method. This method takes the number of seconds you want to use as a time limit
  2. Check whether the code within the with block completed successfully (this is in section three)
  3. If the code did not finish, alert the user (this can be printing a message or raising an error)

Putting the workflow into code looks like Listing 5.11 (similar to our first example). The two main pieces that will change depending on your scenario are the code block within the with block and the NUM_SECONDS value, representing the number of seconds you want to use for the time limit.

Listing 5.11 The code sample shows a skeleton of how to use stopit to set time limits for your code.
import stopit #1

with stopit.\ #2
    ThreadingTimeout(NUM_SECONDS) as context_manager: #2

    [CODE BLOCK HERE] #2


if context_manager.state == context_manager.EXECUTED: #3
    print("COMPLETE...") #3

elif context_manager.state == context_manager.TIMED_OUT: #4
    print("DID NOT FINISH...") #4

    # or raise an error if desired #4
    raise AssertionError("DID NOT FINISH") #4

Uxro, rfv’z vrz c eumoitt txl geadinr pcrz tmlv s atdaeasb!

5.3.2 Set a time-out on reading data from a database

Now, what if we change our earlier example to read in data from a database? This could be useful if we want our code to be able to read in data within a given timeframe. Again, this could be useful in both production and non-production settings, such as enforcing a timeout on reading in data needed for model predictions in real-time applications, or just wanting a limit on how long it takes to get data in case there are issues with the database. Let’s suppose our timeout limit in this case will be 5 minutes.

Listing 5.12 Using a timeout limit to read data from a database can be done similarly to our previous example. Essentially, we’re changing the contents of the code going into our with statement block to reflect pulling data from a database table. Check the code out for yourself in the ch5/stopit_read_from_database.py file.
import stopit #1
import pandas as pd #1
import pyodbc #1

conn = pyodbc.connect( #2
    "DRIVER={SQLite3 ODBC Driver};" #2
    "SERVER=localhost;" #2
    "DATABASE=customers.db;Trusted_connection=yes") #2

with stopit.ThreadingTimeout(300) as context_manager: #3

    customer_key_features = pd.read_sql( #3
        """SELECT churn,
           total_day_charge,
           total_intl_minutes,
           number_customer_service_calls
           FROM customer_data""", #3
        conn) #3


# Did code finish running in #4
# under 300 seconds (5 minutes)? #4
if context_manager.state == context_manager.EXECUTED: #4
    print("FINISHED READING DATA...") #4

# Did code timeout? #5
elif context_manager.state == context_manager.TIMED_OUT: #5
    raise Exception("""DID NOT FINISH
        READING DATA WITHIN TIME LIMIT""") #5

If the code for Listing 5.12 times out, no data will get returned. If you want to read in as much data as possible instead, you could modify the code to fetch one row at a time until the time limit has been reached.
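That row-at-a-time idea can be sketched with only the standard library (fetch_row here is a hypothetical callable standing in for a database cursor; this is our own illustration, not a stopit API):

```python
import time

def read_rows_with_budget(fetch_row, budget_seconds):
    """Collect rows until fetch_row is exhausted or the time budget runs out."""
    rows = []
    deadline = time.monotonic() + budget_seconds
    while time.monotonic() < deadline:
        row = fetch_row()
        if row is None:  # no more rows available
            break
        rows.append(row)
    return rows

# usage with an in-memory stand-in for a database cursor
sample = iter([(1, "yes"), (2, "no"), (3, "yes")])
collected = read_rows_with_budget(lambda: next(sample, None), budget_seconds=5)
```

Rows gathered before the deadline are kept, so a timeout yields partial data instead of none.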

Next, let’s cover an example of using a time-out when training a machine learning model.

5.3.3 Set a time-out on training a machine learning model

In Listing 5.13, we use stopit to set a timeout limit for training a random forest model. The principle here is the same as the previous examples. Effectively, we just need to replace the code within the with block with the new code to train the random forest model. Again, any code that you want to restrict within a time limit should be within the with block. For this example, we are restricting the model training time to have a limit of 180 seconds, so we place the random forest training code within the with block.

Listing 5.13 Using a timeout limit when training a machine learning model. Run the code for yourself using the ch5/stopit_train_model.py file.
import stopit #1
from sklearn.ensemble import RandomForestClassifier #1
from ch4.dataset_class_final import DataSet #1

customer_obj = DataSet( #1
    feature_list = ["total_day_minutes", #1
    "total_day_calls", #1
    "number_customer_service_calls"], #1
    file_name = "data/customer_churn_data.csv", #1
    label_col = "churn", #1
    pos_category = "yes" #1
    ) #1

with stopit.ThreadingTimeout(180) as context_manager: #2

    forest_model = RandomForestClassifier( #2
        n_estimators = 500).\ #2
        fit(customer_obj.train_features, #2
        customer_obj.train_labels) #2


# Did code finish running in #3
# under 180 seconds (3 minutes)? #3
if context_manager.state == context_manager.EXECUTED: #3
    print("FINISHED TRAINING MODEL...") #3

# Did code timeout? #4
elif context_manager.state == context_manager.TIMED_OUT: #4

    # or raise an error if desired #4
    raise AssertionError("""DID NOT FINISH MODEL #4
        TRAINING WITHIN TIME LIMIT""") #4

Let’s go through one more example with stopit. This time, we’re going to show a way of setting a timeout without using a with block.

5.3.4 Using decorators

All of the previous examples place the code we want to set a timeout limit for inside of a with statement block. However, it’s also possible to set a timeout limit when defining a function. The key to doing this is to use an advanced concept called a decorator. Decorators can be a confusing and fairly extensive topic, so we’re just going to provide a highlighted view of them here. Essentially, think of a decorator as a statement written on the line above a function definition that modifies the inputs to the function. That sounds...abstract. Alright, then - let’s do an example using the stopit package. Then, we’ll explain why using a decorator with stopit can be beneficial.
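Before the stopit example, the decorator mechanism itself can be illustrated with a small standard-library sketch (the timed decorator below is our own illustration of the concept, not how stopit is implemented):

```python
import functools
import time

def timed(func):
    # a decorator is a function that takes a function and returns a new one
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        # the wrapped function now returns (result, elapsed_seconds)
        return result, time.perf_counter() - start
    return wrapper

@timed  # equivalent to: add = timed(add)
def add(a, b):
    return a + b

result, elapsed = add(2, 3)
```

Calling add now returns extra information without changing its body - the same trick stopit uses to let a function accept an extra timeout argument.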

First, we need to import the threading_timeoutable method, which we’ll import simply as timeoutable. Then, we add this method as a decorator to a function that trains the random forest model on the training dataset (similar to the previous example in Listing 5.13, except now we’re just wrapping the model training code inside a function). The decorator is specified by the @ symbol in the line immediately above the function definition. As already mentioned, decorators modify the inputs of a function. In this case, our function, train_model, takes the training features and labels as inputs. However, adding this decorator allows us to call the same function with an extra input: the first input will be the number of seconds we want to set as a timeout limit, while the remaining parameters are the same inputs as the original function. In Listing 5.14, we set a timeout limit of 180 seconds, the same limit as the earlier example.

Listing 5.14 This listing’s code uses a timeout limit whenever the train_model function is called. Try the code out for yourself in the ch5/stopit_with_decorator.py file.
from stopit import threading_timeoutable as timeoutable #1
from sklearn.ensemble import RandomForestClassifier #1
from ch4.dataset_class_final import DataSet #1

@timeoutable() #2
def train_model(features, labels): #2

    forest_model = RandomForestClassifier( #2
        n_estimators = 500).\ #2
        fit(features, labels) #2

    return forest_model #2

customer_obj = DataSet( #3
    feature_list = [ #3
    "total_day_minutes", #3
    "total_day_calls", #3
    "number_customer_service_calls"], #3
    file_name = "data/customer_churn_data.csv", #3
    label_col = "churn", #3
    pos_category = "yes" #3
    )

forest_model = train_model(timeout = 180, #4
    features = customer_obj.train_features, #4
    labels = customer_obj.train_labels) #4


# Did code finish running in under
# 180 seconds (3 minutes)?
if forest_model: #5
    print("FINISHED TRAINING MODEL...") #5

# Did code timeout?
else: #6
    raise AssertionError("""DID NOT FINISH
        MODEL TRAINING WITHIN TIME LIMIT""") #6

# [Additional code block...] #7

If the model doesn’t finish executing before the timeout limit, then forest_model will be set to Python’s None value because the function never returns the trained model object. Thus, we can raise an error when this occurs, stating that the model did not complete training before the time limit was up (like the following).

AssertionError: DID NOT FINISH MODEL TRAINING WITHIN TIME LIMIT

A key reason why using a decorator instead of the with block approach is beneficial is that this way provides you with more flexibility and portability in your code. Using the decorator approach allows you to easily add an additional parameter to any function you create in order to limit its execution. You can also use it to apply timeout limits to multiple functions in your code by adding the decorator on top of whatever functions you want to have a timeout limit. These are useful benefits from a software engineering perspective because they make your code more generalizable. However, a potential downside could be that if your codebase doesn’t already make use of decorators, this approach could become less clear for those less familiar with them vs. just using a with block.

Now, let’s summarize the stopit workflow using decorators:

  1. Import the threading_timeoutable method from stopit
  2. Add the method as a decorator to whatever function you want to modify to have a time limit
  3. Check if the function returns a value. If it doesn’t, then the timeout limit was reached

Again, we can view this summary in terms of code like Listing 5.15. To use stopit decorators in your code, you should be able to follow a similar code flow to the skeleton in Listing 5.15. The main differences will be the function definition and the value of NUM_SECONDS for the timeout limit.

Listing 5.15 Using a timeout limit whenever a specific function is called
from stopit import threading_timeoutable as timeoutable #1

@timeoutable() #2
def sample_function(parameter1, #3
    parameter2, #3
    parameter3, #3
    ...): #3

    [CODE BLOCK HERE]

model = sample_function( #4
    parameter1, #4
    parameter2, #4
    parameter3, #4
    ..., #4
    timeout = NUM_SECONDS) #4

if not model: #5
    raise AssertionError("Timeout error") #5

That covers the stopit package! Now, let’s recap what we covered this chapter and give you a chance to practice the material on your own.

5.4 Practice on your own

  1. Download the Spotify zip file from Kaggle (https://www.kaggle.com/datasets/theoverman/the-spotify-hit-predictor-dataset). This file contains a collection of datasets across multiple song-release decades (90s, 00s, 10s, etc.). Use tqdm to monitor the progress of reading each one in as part of a loop.
  2. Using the Spotify dataset file dataset-of-00s.csv (from the zip file you downloaded in step 1), can you monitor the progress of tuning the hyperparameters for a random forest model? The label of the dataset is in the target column.
  3. Can you set a timeout of 5 minutes for the hyperparameter tuning process using a with block?
  4. Can you repeat #3 using a decorator instead?

5.5 Summary

In this chapter we covered monitoring the progress of loops, model training, and auto-stopping code based on a pre-defined timer.

  • tqdm is a Python package that lets you generate progress bars to monitor code execution.
  • The tqdm library is useful in monitoring the progress of loops, such as for loops or list comprehensions.
  • Use scikit-learn’s verbose argument to monitor the progress of model training, such as monitoring the progress of a random forest model.
  • The stopit library can be used to set timeout limits for your code execution.
  • Decorators allow you to alter a function without directly modifying its code. Use stopit's threading_timeoutable method to set a timeout limit for any function’s execution.