Chapter 4. Sequence-to-sequence models and attention

 

Chapter 10 from Natural Language Processing in Action by Hobson Lane, Cole Howard, and Hannes Max Hapke

This chapter covers

  • Mapping one text sequence to another with a neural network
  • Understanding sequence-to-sequence tasks and how they differ from the tasks you’ve learned about so far
  • Using encoder-decoder model architectures for translation and chat
  • Training a model to pay attention to what is important in a sequence

You now know how to create natural language models and use them for everything from sentiment classification to generating novel text (see chapter 9).

Could a neural network translate from English to German? Or even better, would it be possible to predict disease by translating genotype to phenotype (genes to body type)?[1] And what about the chatbot we’ve been talking about since the beginning of the book? Can a neural net carry on an entertaining conversation? These are all sequence-to-sequence problems: they map an input sequence of indeterminate length to an output sequence whose length may differ and isn’t known in advance.

In this chapter, you’ll learn how to build sequence-to-sequence models using an encoder-decoder architecture.
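To give you a feel for where the chapter is headed, here is a minimal sketch of an encoder-decoder network written with Keras’s functional API. This is an illustrative outline, not the book’s own listing: the vocabulary sizes and the latent dimension below are placeholder values chosen only for the example.

# Minimal encoder-decoder sketch (illustrative values, not the book's listing)
from keras.models import Model
from keras.layers import Input, LSTM, Dense

num_encoder_tokens = 71   # placeholder: size of the input token vocabulary
num_decoder_tokens = 93   # placeholder: size of the output token vocabulary
latent_dim = 256          # placeholder: width of the LSTM "thought vector"

# Encoder: read the input sequence and keep only its final internal state
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = LSTM(latent_dim, return_state=True)
_, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: generate the output sequence, primed with the encoder's final state
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_outputs = Dense(num_decoder_tokens, activation='softmax')(decoder_outputs)

# Train end to end on pairs of (input sequence, shifted target sequence)
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

The key idea is that the encoder compresses the entire input sequence into its final LSTM state, and the decoder starts from that state to produce the output one token at a time; section 10.1 examines this architecture in detail.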

10.1. Encoder-decoder architecture

10.2. Assembling a sequence-to-sequence pipeline

10.3. Training the sequence-to-sequence network

10.4. Building a chatbot using sequence-to-sequence networks

10.5. Enhancements

10.6. In the real world

Summary