chapter sixteen

16 Text generation

This chapter covers

  • A brief history of generative modeling
  • Training a miniature GPT model from scratch
  • Using a pretrained Transformer model to build a chatbot
  • Building a multimodal model that can describe images in natural language

In 2014, the idea that, in a not-so-distant future, most of the cultural content we consume would be created with substantial help from AI was met with utter disbelief, even among long-time machine learning practitioners. Fast-forward a decade, and that disbelief has receded at an incredible speed. Generative AI tools are now common additions to word processors, image editors, and development environments. Prestigious awards are going out to literature and art created with generative models, causing considerable controversy and debate. (In 2022, Jason Allen used the image generation software Midjourney to win first place in the digital art category at the Colorado State Fair, and in 2024, Rie Kudan won the Akutagawa Prize, one of Japan’s most prestigious literary awards, for a novel written with substantial help from generative software.) It no longer feels like science fiction to consider a world where AI and artistic endeavors are often intertwined.

In any practical sense, AI is nowhere close to rivaling human screenwriters, painters, or composers. But replacing humans need not, and should not, be the point. In many fields, but especially in creative ones, people will use AI to augment their capabilities: more augmented intelligence than artificial intelligence.

16.1 A brief history of sequence generation

16.2 Training a mini-GPT

16.2.1 Building the model

16.2.2 Pretraining the model

16.2.3 Generative decoding

16.2.4 Sampling strategies

16.3 Using a pretrained LLM

16.3.1 Text generation with the Gemma model

16.3.2 Instruction fine-tuning

16.3.3 Low-Rank Adaptation (LoRA)

16.4 Going further with LLMs

16.4.1 Reinforcement learning with human feedback (RLHF)

16.4.2 Multimodal LLMs

16.4.3 Retrieval-augmented generation (RAG)

16.4.4 “Reasoning” models

16.5 Where are LLMs heading next?

Summary