6 Sequence-to-sequence models


This chapter covers

  • Building a machine translation system using Fairseq
  • Transforming one sentence to another using a Seq2Seq model
  • Using a beam search decoder to generate better output
  • Evaluating the quality of machine translation systems
  • Building a dialogue system (chatbot) using a Seq2Seq model

In this chapter, we are going to discuss sequence-to-sequence (Seq2Seq) models, one of the most important types of complex NLP models, used for a wide range of applications, including machine translation. Seq2Seq models and their variants already serve as fundamental building blocks in many real-world applications, such as Google Translate and speech recognition systems. We are going to build a simple neural machine translation system using Fairseq, a powerful NLP framework, to learn how these models work and how they generate output using greedy and beam search algorithms (a toy sketch of beam search follows this paragraph). At the end of this chapter, we will build a chatbot, an NLP application you can have a conversation with. We'll also discuss the challenges and limitations of simple Seq2Seq models.
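To make the greedy-versus-beam-search distinction concrete before we reach section 6.4, here is a minimal, self-contained Python sketch of beam search over a toy scoring function. The step_scores function, its token distribution, and the "</s>" end-of-sentence marker are hypothetical stand-ins for a trained decoder such as the one a Fairseq model provides; setting beam_size=1 reduces the algorithm to greedy decoding.

import math

def beam_search(step_scores, beam_size=3, max_len=4):
    """Toy beam search over a fixed next-token distribution.

    step_scores is a hypothetical function mapping a partial
    sequence (a tuple of tokens) to a dict of {token: probability}.
    A real translator scores candidates with a trained decoder instead.
    """
    beams = [((), 0.0)]  # each beam is (sequence, cumulative log probability)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == "</s>":  # finished hypothesis: carry it over unchanged
                candidates.append((seq, score))
                continue
            for token, prob in step_scores(seq).items():
                candidates.append((seq + (token,), score + math.log(prob)))
        # keep only the beam_size highest-scoring hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0]  # the best hypothesis found

# Hypothetical scorer: prefers "b" early, then strongly prefers ending.
def toy_scores(seq):
    if len(seq) >= 2:
        return {"</s>": 0.9, "b": 0.1}
    return {"a": 0.4, "b": 0.6}

print(beam_search(toy_scores))  # (('b', 'b', '</s>'), <log probability>)

In a real system, the candidate set at each step is the decoder's score over the full target vocabulary, and beam search typically produces better output than greedy decoding at a modest computational cost; we return to this trade-off in section 6.4.4.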

6.1 Introducing sequence-to-sequence models

6.2 Machine translation 101

6.3 Building your first translator

6.3.1 Preparing the datasets

6.3.2 Training the model

6.3.3 Running the translator

6.4 How Seq2Seq models work

6.4.1 Encoder

6.4.2 Decoder

6.4.3 Greedy decoding

6.4.4 Beam search decoding

6.5 Evaluating translation systems

6.5.1 Human evaluation

6.5.2 Automatic evaluation

6.6 Case study: Building a chatbot