11 Sequence-to-sequence

 

This chapter covers

  • Preparing a sequence-to-sequence dataset and loader
  • Combining RNNs with attention mechanisms
  • Building a machine translation model
  • Interpreting attention scores to understand a model’s decisions

Now that we have learned about attention mechanisms, we can wield them to build something new and powerful. In particular, we will develop an algorithm known as sequence-to-sequence (Seq2Seq for short) that can perform machine translation. As the name implies, this is an approach for getting neural networks to take one sequence as input and produce a different sequence as the output. Seq2Seq has been used to get computers to perform symbolic calculus,[1] summarize long documents,[2] and even translate from one language to another. I’ll show you step by step how we can translate from English to French. In fact, Google used essentially this same approach for its production machine-translation tool, and you can read about it at https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html. If you can imagine your inputs and outputs as sequences of things, there is a good chance Seq2Seq can help you solve your task.
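To make the shape of the problem concrete before we work through the chapter, here is a minimal sketch, not the implementation we will build, of a Seq2Seq model in PyTorch: a GRU encoder reads the source sequence, and a GRU decoder uses a simple attention step over the encoder states to produce the target tokens one at a time. The class name TinySeq2Seq, the vocabulary sizes, and the toy batch at the end are made up purely for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySeq2Seq(nn.Module):            # hypothetical name, illustration only
    def __init__(self, src_vocab, tgt_vocab, dim=64):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, dim)
        self.tgt_embed = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.decoder = nn.GRUCell(dim + dim, dim)   # input = token embedding + context
        self.attn_score = nn.Linear(dim + dim, 1)   # scores each encoder state against the decoder state
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        # src: (B, S) source token ids; tgt: (B, T) target token ids (teacher forcing)
        enc_states, h = self.encoder(self.src_embed(src))   # (B, S, D), (1, B, D)
        h = h.squeeze(0)                                     # (B, D) initial decoder state
        logits = []
        for t in range(tgt.shape[1]):
            # Attention: compare the current decoder state with every encoder state
            h_expanded = h.unsqueeze(1).expand_as(enc_states)                 # (B, S, D)
            scores = self.attn_score(torch.cat([enc_states, h_expanded], dim=-1))
            weights = F.softmax(scores, dim=1)                                # (B, S, 1)
            context = (weights * enc_states).sum(dim=1)                       # (B, D)
            # Feed the ground-truth token for this step (teacher forcing) plus the context;
            # in practice the targets are shifted so step t predicts token t+1
            step_in = torch.cat([self.tgt_embed(tgt[:, t]), context], dim=-1)
            h = self.decoder(step_in, h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)   # (B, T, tgt_vocab)

# Toy usage with made-up sizes: 3 sentences, source length 7, target length 5
model = TinySeq2Seq(src_vocab=100, tgt_vocab=120)
src = torch.randint(0, 100, (3, 7))
tgt = torch.randint(0, 120, (3, 5))
print(model(src, tgt).shape)   # torch.Size([3, 5, 120])

The rest of the chapter fills in everything this sketch glosses over: how to build the dataset and loader, how the autoregressive and teacher-forcing input strategies differ, and how to implement, train, and interpret a full attention-based Seq2Seq translator.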

11.1 Sequence-to-sequence as a kind of denoising autoencoder

 
 

11.1.1  Adding attention creates Seq2Seq

 
 

11.2 Machine translation and the data loader

 
 
 
 

11.2.1  Loading a small English-French dataset

 
 
 

11.3 Inputs to Seq2Seq

 
 

11.3.1  Autoregressive approach

 
 

11.3.2  Teacher-forcing approach

 
 
 
 

11.3.3  Teacher forcing vs. an autoregressive approach

 
 
 
 

11.4 Seq2Seq with attention

 
 

11.4.1  Implementing Seq2Seq

 
 
 

11.4.2  Training and evaluation

 
 
 

Exercises

 
 
 
 

Summary

 
 