16 Generative Large Language Models

 

This chapter covers

  • A brief history of generative modeling
  • Training a miniature GPT model from scratch
  • Using a pretrained transformer model to build a chatbot
  • Building a multi-modal model that can describe images in natural language

Now that we have covered the key building blocks for text modeling, we turn our attention to the open-ended world of text generation. By scaling up the ideas from the previous two chapters, we will build and use conversational models trained on a significant portion of the English-language text available on the internet. We will also discuss both the potential and the shortcomings of such models.
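
As a small preview of what "using" such a model looks like in practice, the sketch below generates a continuation of a prompt with an off-the-shelf pretrained model. It assumes the Hugging Face transformers library and the small gpt2 checkpoint, which are not part of this chapter's own code; the chapter builds up the underlying machinery (training, decoding, sampling) step by step.

# A minimal sketch of text generation with a pretrained model.
# Assumes the Hugging Face `transformers` package and the `gpt2` checkpoint;
# this is an illustration, not the chapter's own implementation.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Generative language models are"
outputs = generator(
    prompt,
    max_new_tokens=40,   # length of the generated continuation
    do_sample=True,      # sample instead of greedy decoding (see section 16.5)
    top_k=50,            # restrict sampling to the 50 most likely next tokens
)
print(outputs[0]["generated_text"])

Running this prints the prompt followed by a sampled continuation; the sampling parameters shown here foreshadow the decoding and sampling strategies discussed in sections 16.4 and 16.5.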

16.1 The potential of generative modeling

 

16.2 A brief history of sequence generation

 
 
 

16.3 Training a miniature GPT

 
 

16.3.1 Building the model

 
 
 

16.3.2 Pretraining the model

 
 

16.4 Generative decoding

 

16.5 Sampling strategies

 
 

16.6 Using a pretrained LLM

 
 
 

16.6.1 Prompting LLMs

 
 

16.7 Instruction fine-tuning an LLM

 
 
 

16.8 Low-Rank Adaptation (LoRA) fine-tuning

 
 
 

16.9 Reinforcement learning from human feedback

 
 
 

16.9.1 Reinforcement learning with chain-of-thought reasoning

 
 
 
 

16.10 Beyond text data

 
 

16.10.1 Extending an LLM for image input

 
 
 
 

16.10.2 Retrieval-augmented generation

 
 
 

16.10.3 Foundation models

 
 
 
 

16.11 Where are LLMs heading next?

 
 
 

16.12 Chapter summary

 
 
 