3 Transformers: How Inputs Become Outputs

 

This chapter covers

  • Converting tokens into vectors
  • Transformers, their types and roles
  • Converting vectors back into tokens
  • Creating the text generation loop

In Chapter 2, we showed how Large Language Models (LLMs) see text as fundamental units known as tokens. Now it’s time to talk about what LLMs do with the tokens they see. The process an LLM uses to generate text is markedly different from how humans form coherent sentences. An LLM operates on tokens, yet it cannot manipulate them the way humans manipulate words, because it has no built-in understanding of the letters each token represents or of how those letters relate to one another.

For example, English speakers know that the words “magic”, “magical”, and “magician” are all related. We understand that sentences containing these words touch on the same subject matter because the words share a common root. An LLM, however, operates on integers representing the tokens that make up these words, and those integers carry no hint of a shared root; without additional work, the model cannot recognize the relationship between the tokens.
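To see why, it helps to look at what a tokenizer actually produces. The short sketch below is a minimal illustration rather than code from this book: it assumes the open-source tiktoken library and its cl100k_base encoding, but any subword tokenizer would make the same point. The integer IDs it prints carry no trace of the root the three words share.

import tiktoken

# Assumed tokenizer for illustration; the model you use may ship a different one.
enc = tiktoken.get_encoding("cl100k_base")

for word in ["magic", "magical", "magician"]:
    ids = enc.encode(word)
    # Each word becomes one or more integer token IDs. Nothing about these
    # integers reveals that all three words are built on the root "magic".
    print(f"{word!r} -> {ids}")

Connecting related tokens to one another is exactly the job of the embedding and transformer layers described in the sections that follow.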

3.1 The Transformer Model

3.1.1 Layers of the Transformer Model

3.2 Exploring the Transformer Architecture in Detail

3.2.1 Embedding Layers

3.2.2 Representing Tokens with Vectors

3.2.3 Adding Positional Information

3.2.4 Transformer Layers

3.2.5 Unembedding Layers

3.2.6 Sampling Tokens to Produce Output

3.3 The Trade-off Between Creativity and Topical Responses

3.4 Transformers in Context

3.5 Summary