11 Sequence-to-sequence learning: Part 1


This chapter covers

  • Understanding sequence-to-sequence data
  • Building a sequence-to-sequence machine translation model
  • Training and evaluating sequence-to-sequence models
  • Repurposing the trained model to generate translations for unseen text

In the previous chapter, we discussed solving the NLP task of language modeling with deep recurrent neural networks. In this chapter, we extend that discussion and learn how recurrent neural networks can solve more complex tasks: those in which an input sequence of arbitrary length is mapped to another sequence of arbitrary length. Machine translation is a fitting example, as it converts a sequence of words in one language into a sequence of words in another.
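To make this idea concrete before we dive in, here is a minimal sketch of an encoder-decoder (seq2seq) model in Keras. The vocabulary sizes, dimensions, and the choice of GRU cells are illustrative assumptions, not the exact architecture we build later in this chapter:

import tensorflow as tf
import tensorflow.keras.layers as layers

# Hypothetical sizes, chosen only for illustration
src_vocab, tgt_vocab, embed_dim, hidden_dim = 5000, 5000, 128, 256

# Encoder: consumes a variable-length source sequence and compresses
# it into a fixed-size state vector
encoder_inputs = layers.Input(shape=(None,), name="encoder_inputs")
enc_emb = layers.Embedding(src_vocab, embed_dim, mask_zero=True)(encoder_inputs)
_, enc_state = layers.GRU(hidden_dim, return_state=True)(enc_emb)

# Decoder: produces a variable-length target sequence, initialized
# with the encoder's final state
decoder_inputs = layers.Input(shape=(None,), name="decoder_inputs")
dec_emb = layers.Embedding(tgt_vocab, embed_dim, mask_zero=True)(decoder_inputs)
dec_out = layers.GRU(hidden_dim, return_sequences=True)(dec_emb, initial_state=enc_state)
predictions = layers.Dense(tgt_vocab, activation="softmax")(dec_out)

model = tf.keras.Model([encoder_inputs, decoder_inputs], predictions)
model.summary()

Because both GRU layers accept sequences of shape (batch, None), nothing in this sketch fixes the input or output lengths in advance; the encoder's final state is the only bridge between the two sequences.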

11.1 Understanding the machine translation data

11.2 Writing an English-German seq2seq machine translator

11.2.1 The TextVectorization layer

11.2.2 Defining the TextVectorization layers for the seq2seq model

11.2.3 Defining the encoder

11.2.4 Defining the decoder and the final model

11.2.5 Compiling the model

11.3 Training and evaluating the model

11.4 From training to inference: Defining the inference model

Summary

Answers to exercises