4 Zero-shot probabilistic forecasting with Lag-Llama
This chapter covers
- Exploring the architecture of Lag-Llama
- Forecasting with Lag-Llama
- Fine-tuning Lag-Llama
In the previous chapter, we explored TimeGPT, a proprietary foundation model developed by Nixtla. While it comes with an easy and intuitive API, TimeGPT will eventually be a paid solution, which might deter some practitioners from using it.
Thus, we now explore Lag-Llama, an open-source foundation model released shortly after TimeGPT. Beyond being open source, Lag-Llama differs from TimeGPT in several other key ways.
At the time of writing, Lag-Llama can only be used by cloning its code repository; there is no Python package or hosted API for interacting with the model. Also, Lag-Llama supports only univariate forecasting, meaning it predicts one series at a time and cannot use exogenous features. Finally, anomaly detection is not an explicit functionality of Lag-Llama.
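To make this clone-based workflow concrete, here is a minimal sketch of zero-shot forecasting with Lag-Llama. It assumes you have cloned the Lag-Llama repository, installed its requirements, and downloaded the pretrained checkpoint as lag-llama.ckpt into your working directory. The LagLlamaEstimator class, its constructor arguments, and the checkpoint layout follow the project's demo notebook at the time of writing and may change, so treat this as an illustration rather than a definitive recipe; we walk through forecasting with Lag-Llama step by step later in this chapter.

```python
# A minimal sketch of zero-shot forecasting with Lag-Llama. It assumes the
# Lag-Llama repository has been cloned, its requirements installed, and the
# pretrained checkpoint saved as "lag-llama.ckpt" in the working directory.
# Class and argument names follow the project's demo notebook at the time of
# writing and may change.
import torch
from gluonts.dataset.repository.datasets import get_dataset
from gluonts.evaluation import make_evaluation_predictions

from lag_llama.gluon.estimator import LagLlamaEstimator  # provided by the cloned repo

dataset = get_dataset("m4_hourly")  # any univariate GluonTS dataset works
prediction_length = dataset.metadata.prediction_length

# The checkpoint stores the hyperparameters needed to rebuild the model
ckpt = torch.load("lag-llama.ckpt", map_location="cpu")
model_kwargs = ckpt["hyper_parameters"]["model_kwargs"]

estimator = LagLlamaEstimator(
    ckpt_path="lag-llama.ckpt",
    prediction_length=prediction_length,
    context_length=32,  # length of history fed to the model
    input_size=model_kwargs["input_size"],
    n_layer=model_kwargs["n_layer"],
    n_embd_per_head=model_kwargs["n_embd_per_head"],
    n_head=model_kwargs["n_head"],
    scaling=model_kwargs["scaling"],
    time_feat=model_kwargs["time_feat"],
)

# Zero-shot: no training step, the pretrained weights are used as-is
transformation = estimator.create_transformation()
lightning_module = estimator.create_lightning_module()
predictor = estimator.create_predictor(transformation, lightning_module)

forecast_it, ts_it = make_evaluation_predictions(
    dataset=dataset.test, predictor=predictor, num_samples=100
)
forecasts = list(forecast_it)
print(forecasts[0].quantile(0.5))  # median of the predicted distribution
```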
Now that we have a general idea of Lag-Llama's capabilities, let's explore the model in more detail, starting with its architecture.
4.1 Exploring Lag-Llama
As mentioned before, Lag-Llama is a probabilistic forecasting model, meaning that instead of producing point forecasts, it outputs a distribution of possible future values [1].
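To see what that means in practice, the following self-contained sketch mimics the sample-based output of a probabilistic model. The samples array here is synthetic and simply stands in for the sampled future trajectories a model like Lag-Llama would return; from those samples we derive a median forecast and an 80% prediction interval.

```python
# A self-contained illustration of probabilistic output: the samples array is
# synthetic and stands in for the sampled future trajectories a probabilistic
# model returns. From many sample paths we derive a median forecast and an
# 80% prediction interval.
import numpy as np

rng = np.random.default_rng(42)
horizon, num_samples = 12, 100

# 100 possible futures over a 12-step horizon (one row per sampled trajectory)
samples = rng.normal(loc=50.0, scale=5.0, size=(num_samples, horizon))

median_forecast = np.quantile(samples, 0.5, axis=0)  # point summary of the distribution
lower_80 = np.quantile(samples, 0.1, axis=0)         # lower bound of the 80% interval
upper_80 = np.quantile(samples, 0.9, axis=0)         # upper bound of the 80% interval

print(median_forecast.round(1))
print(lower_80.round(1), upper_80.round(1))
```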