11 Sequence-to-sequence
This chapter covers
- Preparing a sequence-to-sequence dataset and loader
- Combining RNNs with attention mechanisms
- Building a machine translation model
- Interpreting attention scores to understand a model’s decisions
Now that we have learned about attention mechanisms, we can wield them to build something new and powerful. In particular, we will develop an algorithm known as sequence-to-sequence (Seq2Seq for short) that can perform machine translation. As the name implies, this is an approach for getting neural networks to take one sequence as input and produce a different sequence as output. Seq2Seq has been used to get computers to perform symbolic calculus,1 summarize long documents,2 and even translate from one language to another. I’ll show you step by step how we can translate from English to French. In fact, Google used essentially this same approach for its production machine-translation tool, and you can read about it at https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html. If you can represent your inputs and outputs as sequences of things, there is a good chance Seq2Seq can help you solve your task.
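To make the core idea concrete before we build the full model, here is a minimal sketch of the encoder-decoder pattern behind Seq2Seq in PyTorch. This is an illustrative skeleton, not the chapter’s final model: the class name `SimpleSeq2Seq`, the vocabulary sizes, and the hidden dimension are all hypothetical, and it omits the attention mechanism we will add later. What it does show is the key property named above: the input and output sequences can have different lengths.

```python
import torch
import torch.nn as nn

class SimpleSeq2Seq(nn.Module):
    """Encoder-decoder skeleton: encode the source sequence into a
    hidden state, then decode the target sequence conditioned on it.
    (An illustrative sketch; no attention mechanism yet.)"""
    def __init__(self, src_vocab, tgt_vocab, dim=64):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)  # scores over the target vocabulary

    def forward(self, src, tgt):
        # Encode: the final hidden state summarizes the whole source sequence
        _, h = self.encoder(self.src_embed(src))
        # Decode: produce target-length outputs conditioned on that summary
        dec_out, _ = self.decoder(self.tgt_embed(tgt), h)
        return self.out(dec_out)  # shape: (batch, tgt_len, tgt_vocab)

# Input and output lengths need not match: 7 source tokens, 5 target tokens
model = SimpleSeq2Seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (2, 7))  # a batch of 2 "English" sequences
tgt = torch.randint(0, 1200, (2, 5))  # a batch of 2 "French" sequences
print(model(src, tgt).shape)          # torch.Size([2, 5, 1200])
```

Notice that the decoder sees the source sentence only through the encoder’s final hidden state; the attention mechanisms from the previous chapter will let the decoder look back at every encoder position instead, which is what makes the full model work well.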