8 Reduce, reuse, and recycle your words: RNNs and LSTMs

 

This chapter covers

  • Unrolling recursion, so you can understand how to use it for NLP
  • Implementing word and character-based recurrent neural networks (RNNs) in PyTorch
  • Identifying applications where RNNs are your best option
  • Understanding backpropagation through time
  • Making your RNN smarter with long short-term memory (LSTM)

Recurrent neural networks (RNNs) are a game changer for NLP. They have spawned an explosion of practical applications and advances in deep learning and AI, including real-time transcription and translation on mobile phones, high-frequency algorithmic trading, and efficient code generation. RNNs recycle tokens, but why would you want to recycle and reuse your words? To build a more sustainable NLP pipeline, of course! Recurrence is just another word for recycling. An RNN uses recurrence to remember the tokens it has already read and to reuse that understanding to predict the target variable. And if the target variable is the next word, an RNN can keep generating text, word after word, until you tell it to stop. This sustainable, regenerative ability is the RNN's superpower.
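To make the recycling metaphor concrete, here is a minimal PyTorch sketch of that loop: a single cell reads one token at a time and feeds its hidden state back in alongside the next token. The class name MinimalRNNCell, the toy layer sizes, and the made-up token IDs are illustrative assumptions, not the code you will build later in this chapter.

import torch
from torch import nn

class MinimalRNNCell(nn.Module):
    """One step of a plain (Elman-style) RNN."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.in2hidden = nn.Linear(2 * hidden_size, hidden_size)
        self.hidden2out = nn.Linear(hidden_size, vocab_size)

    def forward(self, token_id, hidden):
        # Recycle: combine the current token with the hidden state from the previous step
        combined = torch.cat([self.embed(token_id), hidden], dim=-1)
        hidden = torch.tanh(self.in2hidden(combined))
        logits = self.hidden2out(hidden)        # scores for the next token
        return logits, hidden

vocab_size, hidden_size = 100, 32               # toy sizes for illustration
cell = MinimalRNNCell(vocab_size, hidden_size)
hidden = torch.zeros(hidden_size)               # start with an empty "memory"
tokens = torch.tensor([4, 8, 15, 16])           # a made-up token-ID sequence
for token_id in tokens:
    logits, hidden = cell(token_id, hidden)     # the same cell and hidden state are reused each step
next_token = logits.argmax().item()             # greedy guess at the next token

The key design choice is that the same weights are applied at every position; only the hidden state changes as it carries what the network has read so far, which is exactly the memory the rest of this chapter builds on.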

8.1 What are RNNs good for?

8.1.1 RNN sequence handling

8.1.2 RNNs remember everything you tell them

8.1.3 RNNs hide their understanding

8.2 Predicting nationality with only a last name

8.2.1 Building an RNN from scratch

8.2.2 Training an RNN, one token at a time

8.2.3 Understanding the results

8.2.4 Multiclass classifiers vs. multi-label taggers

8.3 Backpropagation through time

8.3.1 Initializing the hidden layer in an RNN

8.4 Remembering with recurrent networks

8.4.1 Word-level language models

8.4.2 Gated recurrent units

8.4.3 Long short-term memory

8.4.4 Giving your RNN a tune-up

8.5 Predicting

8.6 Test yourself

Summary