
8 Reduce, reuse, recycle your words (RNNs and LSTMs)

This chapter covers

  • Unrolling recurrence so you can understand how to use it for NLP
  • Implementing word- and character-based RNNs in PyTorch
  • Identifying applications where RNNs are your best option
  • Re-engineering your datasets for training RNNs
  • Customizing and tuning your RNN structure for your NLP problems
  • Understanding backpropagation through time (BPTT)
  • Combining long- and short-term memory mechanisms to make your RNN smarter

An RNN (recurrent neural network) recycles tokens. Why would you want to recycle and reuse your words? To build a more sustainable NLP pipeline, of course! ;) Recurrence is just another word for recycling. An RNN uses recurrence to remember the tokens it has already read and to reuse that understanding when it predicts the target variable. And if you use an RNN to predict the next word, it can generate text, going on and on and on, until you tell it to stop. This sustainability, or regenerative ability, is the RNN's superpower.
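
Before digging in, here is a minimal sketch of what that recycling looks like in code. It is not one of this chapter's listings; it assumes PyTorch, and the class name MinimalRNN is made up for illustration. The same cell is applied to every token, and the hidden state it returns is fed back in for the next token, so the network carries its memory of earlier tokens forward. The final hidden state is then reused to score what the next token should be.

import torch
from torch import nn

class MinimalRNN(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.cell = nn.RNNCell(hidden_size, hidden_size)
        self.to_vocab = nn.Linear(hidden_size, vocab_size)

    def forward(self, token_ids):
        hidden = torch.zeros(1, self.cell.hidden_size)  # start with an empty "memory"
        for token_id in token_ids:                      # read one token at a time
            x = self.embed(token_id).unsqueeze(0)       # (1, hidden_size)
            hidden = self.cell(x, hidden)               # recycle the hidden state
        return self.to_vocab(hidden)                    # scores (logits) for the next token

model = MinimalRNN(vocab_size=100, hidden_size=16)
tokens = torch.tensor([3, 14, 15])                      # made-up token ids
print(model(tokens).shape)                              # torch.Size([1, 100])

The for loop is the recurrence. Section 8.2 builds this kind of model up from scratch, and section 8.3 shows how the gradients flow back through that loop during training.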

8.1 What are RNNs good for?

8.1.1 RNNs remember everything you tell them

8.1.2 RNNs hide their understanding

8.2 Predict someone’s nationality from only their last name

8.2.1 Build an RNN from scratch

8.2.2 Training an RNN, one token at a time

8.2.3 Understanding the results

8.2.4 Multiclass classifiers vs multi-label taggers

8.3 Backpropagation through time

8.3.1 Initializing the hidden layer in an RNN

8.4 Remembering with recurrent networks

8.4.1 Word-level language models

8.4.2 Gated Recurrent Units (GRUs)

8.4.3 Long short-term memory (LSTM)

8.4.4 Give your RNN a tuneup