11 Sequence to Sequence
This chapter covers
- Preparing a sequence-to-sequence dataset and loader.
- Combining recurrent neural networks with attention mechanisms.
- Building a machine translation model.
- Interpreting attention scores to understand a model’s decisions.
Now that we have learned about attention mechanisms, we can wield them to build something new and powerful. In particular, we are going to develop an approach known as sequence-to-sequence (Seq2Seq for short) that can perform machine translation. As the name implies, Seq2Seq gets a neural network to take one sequence as input and produce a different sequence as output. This has been used to get computers to perform symbolic calculus[1], summarize long documents[2], and even translate from one language to another. I’ll show you, step by step, how we can use it to translate from English to French. In fact, Google used essentially this same approach for its production machine-translation tool, which you can read about on its blog: https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html.
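Before we dig into the details, it may help to see the encoder/decoder skeleton that underlies every Seq2Seq model. The sketch below is a minimal illustration in PyTorch, without attention: one RNN compresses the input sequence into a hidden state, and a second RNN generates the output sequence from that summary. The vocabulary sizes, dimensions, and the class name `TinySeq2Seq` are illustrative placeholders, not values we will use in this chapter.

```python
# A minimal sketch of the Seq2Seq encoder/decoder idea (no attention yet).
# All sizes below are illustrative placeholders, not this chapter's values.
import torch
import torch.nn as nn

class TinySeq2Seq(nn.Module):
    def __init__(self, src_vocab=101, tgt_vocab=101, dim=64):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        # Encode the entire input sequence into one final hidden state.
        _, h = self.encoder(self.src_embed(src))
        # Decode the output sequence, starting from the encoder's summary.
        dec_out, _ = self.decoder(self.tgt_embed(tgt), h)
        return self.out(dec_out)  # (B, T_tgt, tgt_vocab) token scores

src = torch.randint(0, 101, (2, 7))  # batch of 2 input sequences, length 7
tgt = torch.randint(0, 101, (2, 5))  # output sequences of a different length, 5
scores = TinySeq2Seq()(src, tgt)
print(scores.shape)  # torch.Size([2, 5, 101])
```

Notice that the input and output sequences have different lengths (7 and 5 here); that freedom is exactly what translation requires, and it is the core of what Seq2Seq provides.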