This chapter covers
- Mapping one text sequence to another with a neural network
- Understanding sequence-to-sequence tasks and how they differ from the other tasks you’ve learned about
- Using encoder-decoder model architectures for translation and chat
- Training a model to pay attention to what is important in a sequence
You now know how to create natural language models and use them for everything from sentiment classification to generating novel text (see chapter 9).
Could a neural network translate from English to German? Or even better, would it be possible to predict disease by translating genotype to phenotype (genes to body type)?[1] And what about the chatbot we’ve been talking about since the beginning of the book? Can a neural net carry on an entertaining conversation? These are all sequence-to-sequence problems. They map one sequence of indeterminate length to another sequence whose length is also unknown.
1 geno2pheno: https://academic.oup.com/nar/article/31/13/3850/2904197.
In this chapter, you’ll learn how to build sequence-to-sequence models using an encoder-decoder architecture.
Which of our previous architectures do you think might be useful for sequence-to-sequence problems? The word vector embedding model of chapter 6? The convolutional net of chapter 7, or the recurrent nets of chapters 8 and 9? You guessed it: we’re going to build on the LSTM architecture from the last chapter.
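Before we dig into the details, here is a rough sketch of the idea, assuming a Keras-style API; the layer sizes and token counts below are illustrative placeholders, not the values we’ll use later. An encoder LSTM reads the input sequence and hands its final internal state to a decoder LSTM, which generates the output sequence one token at a time.

```python
# A minimal encoder-decoder sketch, assuming Keras-style LSTM layers.
# num_tokens and latent_dim are hypothetical placeholder values.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, LSTM, Dense

num_tokens = 71    # hypothetical vocabulary size (e.g. one-hot characters)
latent_dim = 256   # hypothetical size of the encoder's "thought vector"

# Encoder: read the input sequence and keep only its final internal state
encoder_inputs = Input(shape=(None, num_tokens))
_, state_h, state_c = LSTM(latent_dim, return_state=True)(encoder_inputs)
encoder_states = [state_h, state_c]

# Decoder: generate the output sequence, seeded with the encoder's state
decoder_inputs = Input(shape=(None, num_tokens))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_outputs = Dense(num_tokens, activation='softmax')(decoder_outputs)

# Train on pairs of input sequences and target sequences
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
```

The only link between the two halves is that final encoder state, a fixed-length summary of the input. Squeezing a whole sequence through that bottleneck is exactly the limitation that the attention mechanism later in this chapter is designed to relieve.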