6 Sequence-to-Sequence Models

 

This chapter covers

  • Building a machine translation system using fairseq
  • Transforming one sentence to another using a Seq2Seq model
  • Using a beam search decoder to generate better output
  • Evaluating the quality of machine translation systems
  • Building a dialog system (chatbot) using a Seq2Seq model

In this chapter, we are going to discuss sequence-to-sequence (Seq2Seq) models, one of the most important families of complex NLP models, used for a wide range of applications including machine translation. Seq2Seq models and their variants already serve as fundamental building blocks in many real-world applications, such as Google Translate and speech recognition systems. We are going to build a simple neural machine translation system using fairseq, a powerful framework, to learn how these models work and how to generate output using greedy and beam search algorithms. At the end of this chapter, we will build a chatbot: an NLP application you can hold a conversation with. We'll also discuss the challenges and limitations of simple Seq2Seq models.
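Before diving in, here is a minimal, framework-independent sketch of the beam search decoding idea previewed above. Everything in it is a hypothetical stand-in rather than fairseq's actual API: step_log_probs represents whatever function your model provides to score candidate next tokens given a prefix, and the token values, beam width, and length limit are placeholders.

def beam_search(step_log_probs, start_token, end_token,
                beam_width=5, max_len=20):
    """Sketch of beam search decoding.

    step_log_probs(prefix) is assumed to return a dict mapping each
    candidate next token to its log-probability given the prefix.
    """
    # Each hypothesis is a (token sequence, cumulative log-prob) pair.
    beams = [([start_token], 0.0)]
    completed = []
    for _ in range(max_len):
        # Expand every live hypothesis by every candidate next token.
        candidates = []
        for seq, score in beams:
            for token, logp in step_log_probs(seq).items():
                candidates.append((seq + [token], score + logp))
        # Keep only the beam_width highest-scoring hypotheses.
        candidates.sort(key=lambda cand: cand[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end_token:
                completed.append((seq, score))  # hypothesis finished
            else:
                beams.append((seq, score))      # keep expanding
        if not beams:
            break
    # Return the best finished hypothesis (or best partial one).
    return max(completed + beams, key=lambda cand: cand[1])

Greedy decoding is simply the special case beam_width=1: at each step only the single most probable continuation survives, which is faster but often yields worse output than a wider beam. We'll revisit both algorithms in detail in section 6.4.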

6.1      Introduction to sequence-to-sequence models

 
 

6.2      Machine translation 101

 
 
 
 

6.3      Building your first translator

 
 
 

6.3.1  Preparing the datasets

 
 

6.3.2  Training the model

 
 
 
 

6.3.3  Running the translator

 
 
 
 

6.4      How Seq2Seq models work

 
 
 
 

6.4.1  Encoder

 
 

6.4.2  Decoder

 

6.4.3  Greedy decoding

 
 

6.4.4  Beam search decoding

 
 

6.5      Evaluating translation systems

 
 

6.5.1  Human evaluation

 
 
 

6.5.2  Automatic evaluation

 
 
 

6.6      Case study: building a chatbot

 
 
 
 

6.6.1  Introduction to dialog systems

 

6.6.2  Preparing a dataset

 
 
 

6.6.3  Training and running a chatbot

 
 

6.6.4  Next steps

 

6.7      Summary

 
 