Chapter 4. Sequence-to-sequence models and attention
Chapter 10 from Natural Language Processing in Action by Hobson Lane, Cole Howard, and Hannes Max Hapke
This chapter covers
- Mapping one text sequence to another with a neural network
- Understanding sequence-to-sequence tasks and how they differ from the tasks you’ve learned about so far
- Using encoder-decoder model architectures for translation and chat
- Training a model to pay attention to what is important in a sequence
You now know how to create natural language models and use them for everything from sentiment classification to generating novel text (see chapter 9).
Could a neural network translate from English to German? Or, even better, could it predict disease by translating genotype to phenotype (genes to body type)?[1] And what about the chatbot we’ve been talking about since the beginning of the book? Can a neural net carry on an entertaining conversation? These are all sequence-to-sequence problems: they map an input sequence of arbitrary length to an output sequence whose length is also not known in advance.
1 geno2pheno: https://academic.oup.com/nar/article/31/13/3850/2904197.
In this chapter, you’ll learn how to build sequence-to-sequence models using an encoder-decoder architecture.
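To give you a feel for where the chapter is headed, here is a minimal sketch of that encoder-decoder structure in Keras. It assumes the sequences are fed in as one-hot vectors, and the vocabulary sizes and layer width are placeholder values, not the ones you'll use for a real corpus; the full, trainable pipeline is built up step by step later in the chapter.

```python
from keras.models import Model
from keras.layers import Input, LSTM, Dense

num_encoder_tokens = 71   # placeholder: size of the input vocabulary
num_decoder_tokens = 93   # placeholder: size of the output vocabulary
num_neurons = 256         # placeholder: dimensionality of the "thought vector"

# Encoder: read the input sequence and keep only its final internal state.
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(num_neurons, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: generate the output sequence, starting from the encoder's state.
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(num_neurons, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

# Training model: [input sequence, shifted target sequence] -> target sequence.
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
```

The key idea to notice is that the only thing connecting the two halves is the encoder's final state: the decoder starts from that state and produces the output sequence one step at a time, which is what lets the input and output lengths differ.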