4 Zero-shot probabilistic forecasting with Lag-Llama


This chapter covers

  • Exploring the architecture of Lag-Llama
  • Forecasting with Lag-Llama
  • Fine-tuning Lag-Llama

In the previous chapter, we explored TimeGPT, a proprietary foundation model developed by Nixtla. While it comes with an API that is easy and intuitive to use, it will eventually be a paid solution, which might deter some practitioners from using it.

Thus, we now explore Lag-Llama, an open-source foundation model that was published around the time TimeGPT was released. Beyond being open source, Lag-Llama differs from TimeGPT in several key ways.

At the time of writing, Lag-Llama can only be used by cloning its code base; there is no Python package or API to interact with the model. As such, Lag-Llama is mostly suited to quick proofs of concept or research projects.

Also, Lag-Llama supports only univariate forecasting, meaning that only one series at a time can be predicted and no exogenous features can be included. While anomaly detection is technically possible with Lag-Llama, it is not covered here since, as mentioned, the model is not meant to be used in production.

4.1 Exploring Lag-Llama

4.1.1 Architecture of Lag-Llama

4.1.2 Pretraining Lag-Llama

4.2 Forecasting with Lag-Llama

4.2.1 Setting up Lag-Llama

4.2.2 Zero-shot forecasting with Lag-Llama

4.2.3 Changing the context length in Lag-Llama

4.3 Fine-tuning Lag-Llama

4.4 Next steps

4.5 Summary

4.6 References