Chapter 1

1 Understanding foundation models


This chapter covers

  • Defining foundation models
  • Exploring the transformer architecture
  • Understanding the advantages and drawbacks of foundation models
  • Using foundation models for time-series forecasting

Foundation models represent a major paradigm shift in machine learning. Traditionally, we build data-specific models: each model is trained on a dataset tailored to a particular scenario, so it specializes in a single use case. For a different scenario, a new model must be trained on data specific to that scenario.
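To make the data-specific paradigm concrete, here is a minimal sketch (the series values and helper name are invented for illustration) in which two unrelated scenarios each get their own model, trained only on their own history:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def make_lagged(series, n_lags=3):
    # Turn a series into a supervised dataset: windows of past values -> next value.
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Two unrelated scenarios: daily sales and daily temperatures.
sales = np.array([10, 12, 13, 15, 16, 18, 20, 21], dtype=float)
temps = np.array([3.0, 4.5, 5.0, 6.5, 8.0, 7.5, 9.0, 10.0])

# One model per scenario -- neither is expected to transfer to the other.
models = {}
for name, series in {"sales": sales, "temps": temps}.items():
    X, y = make_lagged(series)
    models[name] = LinearRegression().fit(X, y)

# Forecast the next sales value from the last three observations.
next_sales = models["sales"].predict(sales[-3:].reshape(1, -1))
print(next_sales)
```

A foundation model inverts this workflow: one large pretrained model is reused across scenarios, often with no scenario-specific training at all.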

We keep finding new ways to apply and interact with foundation models. Video meeting applications such as Microsoft Teams use foundation models to summarize the key points of a presentation. Canva, which builds web-based design tools, lets users create an image from a text prompt using the DALL-E model developed by OpenAI. Millions of people have interacted with ChatGPT, whose free version uses the GPT-5 model to generate text and code. Toys“R”Us even created a video ad with Sora, a foundation model that generates video from text [1]. This book, however, focuses on foundation models applied to time-series forecasting, a task with a wide range of applications of its own, including weather forecasting and demand planning.

1.1 Defining a foundation model

1.2 Exploring the transformer architecture

1.2.1 Feeding the encoder

1.2.2 Inside the encoder

1.2.3 Making predictions

1.3 Advantages and disadvantages of foundation models

1.3.1 Benefits of foundation forecasting models

1.3.2 Drawbacks of foundation forecasting models

1.4 Next steps

Summary