10 Capstone project: forecasting daily visits to a blog

 

This chapter covers

  • Making accurate predictions for the daily traffic volume to a blog’s website.

First of all, congratulations on making it this far. Throughout this book, we have explored and implemented many large time models, each with its own advantages and disadvantages. We experimented with their zero-shot forecasting capabilities and fine-tuned them for specific scenarios.

Now, we cement our learning with this capstone project. The goal of this chapter is for you to apply what you have learned throughout the book in a new scenario, using a new dataset. While a suggested solution is provided, the main idea is still for you to experiment with the different approaches, design your own experiments and adjust each model to try to generate the most accurate forecasts possible.

Specifically, the objective of this project is to forecast the daily number of visitors to a blog’s website. Because foundation models are trained on massive amounts of time series data, I extracted the data from my own blog (https://www.datasciencewithmarco.com/blog) to ensure that no foundation model has seen it. The data starts on January 1, 2021, and ends on October 12, 2023. It records the daily number of visitors, along with an indicator for days when a new article is published and an indicator for holidays. The dataset is plotted in figure 10.1.
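Before loading the data, it can help to picture its layout. The following is a minimal sketch using only the Python standard library. It assumes one row per calendar day with no gaps, and the field names (`day`, `visitors`, `new_article`, `holiday`) are hypothetical placeholders, not the actual column names of the dataset. It builds an empty daily calendar over the stated date range so you know how many observations to expect.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

# Date range stated in the text
START = date(2021, 1, 1)
END = date(2023, 10, 12)


@dataclass
class DailyRecord:
    # Field names are hypothetical; the real dataset may name them differently
    day: date
    visitors: Optional[int]  # daily number of visitors (None until loaded)
    new_article: bool        # indicator: a new article was published that day
    holiday: bool            # indicator: the day is a holiday


def empty_calendar(start: date, end: date) -> List[DailyRecord]:
    """Return one placeholder record per day from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [
        DailyRecord(start + timedelta(days=i), None, False, False)
        for i in range(n_days)
    ]


calendar = empty_calendar(START, END)
print(len(calendar))  # 1015 days if the series has no missing dates
```

If the file you load has fewer rows than this calendar, some days are missing, and you would likely want to decide how to handle them before fitting any model.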

10.1 Step-by-step walkthrough of the project

10.1.1 Setting the constants

10.1.2 Forecasting with a seasonal naïve model

10.1.3 Forecasting with ARIMA

10.1.4 Forecasting with TimeGPT

10.1.5 Forecasting with Chronos

10.1.6 Forecasting with Moirai

10.1.7 Forecasting with TimesFM

10.1.8 Forecasting with Time-LLM

10.1.9 Evaluating all models

10.2 Next steps