
2 Build a transformer


This chapter covers

  • How the attention mechanism assigns weights to elements in a sequence (see the attention sketch following this list)
  • Building an encoder–decoder transformer from scratch for language translation
  • Word embedding and positional encoding (see the positional-encoding sketch following this list)
  • Training the transformer from scratch to translate German to English

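To make the first bullet concrete before we begin, here is a minimal sketch of scaled dot-product attention, the mechanism that assigns those weights. It assumes PyTorch; the function name, tensor shapes, and masking convention here are illustrative, not the implementation developed later in the chapter.

import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_model); illustrative shapes, not the book's
    d_k = q.size(-1)
    # Score each query against every key, scaled by sqrt(d_k) for stability
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions get -inf so softmax assigns them near-zero weight
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Softmax turns the scores into weights that sum to 1 across the sequence
    weights = torch.softmax(scores, dim=-1)
    # Each output vector is a weighted average of the value vectors
    return weights @ v, weights

Each row of weights sums to 1, which is exactly the "assigns weights to elements in a sequence" behavior the first bullet refers to.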
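Likewise, here is a minimal sketch of sinusoidal positional encoding, assuming PyTorch, an even d_model, and the standard sine/cosine formulation; the chapter's own version may differ in detail.

import math
import torch

def positional_encoding(max_len, d_model):
    # Returns a (max_len, d_model) table that is added to the word
    # embeddings; assumes d_model is even (standard sine/cosine scheme)
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                         * (-math.log(10000.0) / d_model))
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe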
Understanding attention and transformer architectures is foundational for modern generative AI, especially for text-to-image models. This chapter comes at the very beginning of our journey to build a text-to-image generator from scratch for two reasons:

2.1 An overview of attention and transformers

2.1.1 How the attention mechanism works

2.1.2 How to create a transformer

2.2 Word embedding and positional encoding

2.2.1 Word tokenization with the spaCy library

2.2.2 A sequence padding function

2.2.3 Input embedding from word embedding and positional encoding

2.3 Creating an encoder–decoder transformer

2.3.1 Coding the attention mechanism

2.3.2 Defining the Transformer() class

2.3.3 Creating a language translator

2.4 Training and using the German-to-English translator

2.4.1 Training the encoder–decoder transformer

2.4.2 Translating German to English with the trained model

Summary