chapter one

1 The Elements of AI

 

This chapter covers

  • How large language models (LLMs) process inputs and generate outputs;
  • How LLMs represent information;
  • The transformer architecture that powers LLMs;
  • Different types of machine learning;
  • How LLMs and other AI models learn from data;
  • How convolutional neural networks are used to process images, video, and audio;
  • How different types of data are combined (e.g., to produce images from text).

This chapter will help you understand how AI works and get you up to speed on many foundational AI topics. Since the latest AI boom, many of these topics, such as “embeddings” and “temperature,” have become widely discussed not just by AI practitioners but also by businesspeople and the general public. This chapter demystifies them.

Rather than piling up definitions and textbook explanations, this chapter takes a more opinionated approach. It points out common AI problems, misconceptions, and limitations based on my experience working in the field. It is also filled with practical curiosities and commentary. For example, it discusses why language generation is more expensive in French than in English, and it reveals how OpenAI hires armies of human workers to manually help “tame” ChatGPT. So, even if you already know all the topics covered in this chapter, reading it might give you a different take on them.

1.1 Large language models (LLMs)

1.1.1 Text generation

1.1.2 End of text

1.1.3 Chat

1.1.4 The system prompt

1.1.5 Calling external software functions

1.1.6 Retrieval-augmented generation (RAG)

1.2 Tokens

1.2.1 One token at a time

1.2.2 Billed by the token

1.2.3 What about languages other than English?

1.2.4 Why do LLMs need tokens anyway?

1.3 Embeddings

1.3.1 Machine learning and embeddings

1.3.2 Visualizing embeddings

1.3.3 Why embeddings are useful

1.3.4 Why LLMs struggle to analyze individual letters

1.4 The transformer architecture

1.4.1 Step 1: Initial embeddings

1.4.2 Step 2: Contextualization

1.4.3 Step 3: Predictions

1.4.4 Temperature

1.4.5 Can you get an LLM to always output the same thing?

1.4.6 Where to learn more

1.5 Machine learning

1.5.1 Deep learning

1.5.2 Types of machine learning

1.5.3 How LLMs are trained (and tamed)

1.5.4 A note on privacy

1.5.5 Loss

1.5.6 Stochastic gradient descent (SGD)

1.6 Convolutions (images, video and audio)

1.7 Multi-modal AI

1.8 No Free Lunch

1.9 Summary