11 Sequence to Sequence
This chapter covers
- Preparing a sequence-to-sequence dataset and loader.
- Combining recurrent neural networks with attention mechanisms.
- Building a machine translation model.
- Interpreting attention scores to understand a model’s decisions.
Now that we have learned about attention mechanisms, we can wield them to build something new and powerful. In particular, we are going to develop an approach known as sequence-to-sequence (Seq2Seq for short) that can perform machine translation. As the name implies, Seq2Seq gets a neural network to take one sequence as input and produce a different sequence as output. This has been used to get computers to perform symbolic calculus[1], summarize long documents[2], and even translate from one language to another. I’ll show you, step by step, how we can use it to translate from English to French. In fact, Google used essentially this same approach for its production machine-translation tool, which you can read about on its blog: https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html.
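Before we dig into the details, it may help to see the encoder/decoder skeleton that underlies every Seq2Seq model. The sketch below is a minimal illustration in PyTorch, without attention: one RNN compresses the input sequence into a hidden state, and a second RNN generates the output sequence from that summary. The vocabulary sizes, dimensions, and the class name `TinySeq2Seq` are illustrative placeholders, not values we will use in this chapter.

```python
# A minimal sketch of the Seq2Seq encoder/decoder idea (no attention yet).
# All sizes below are illustrative placeholders, not this chapter's values.
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, src_vocab=101, tgt_vocab=101, dim=64):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the entire input sequence into one final hidden state.
        _, h = self.encoder(self.src_embed(src))
        # Decode the output sequence, starting from the encoder's summary.
        dec_out, _ = self.decoder(self.tgt_embed(tgt), h)
        return self.out(dec_out)  # (B, T_tgt, tgt_vocab) token scores

src = torch.randint(0, 101, (2, 7))  # batch of 2 input sequences, length 7
tgt = torch.randint(0, 101, (2, 5))  # output sequences of a different length, 5
scores = TinySeq2Seq()(src, tgt)
print(scores.shape)  # torch.Size([2, 5, 101])
```

Notice that the input and output sequences have different lengths (7 and 5 here); that freedom is exactly what translation requires, and it is the core of what Seq2Seq provides.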