15 Remembering the past with LSTM

 

This chapter covers

  • Examining the long short-term memory (LSTM) architecture
  • Implementing an LSTM with Keras

In the last chapter, we built our first deep learning models, implementing both a linear model and a deep neural network. On our dataset, both models outperformed the baselines we built in chapter 13, with the deep neural network performing best on the single-step, multi-step, and multi-output tasks.

Now we’ll explore a more advanced architecture called long short-term memory (LSTM), which is a particular case of a recurrent neural network (RNN). This type of neural network is used to process sequences of data, where the order matters. One common application of RNNs and LSTMs is natural language processing. Words in a sentence have an order, and changing that order can completely change the meaning of the sentence. Thus, we often find this architecture behind text classification and text generation algorithms.

Another situation where the order of data matters is time series. We know that time series are sequences of data equally spaced in time, and that their order cannot be changed. The data point observed at 9 a.m. must come before the data point at 10 a.m. and after the data point at 8 a.m. Thus, it makes sense to apply the LSTM architecture to forecasting time series, as in the sketch below.
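To give a sense of what is coming, here is a minimal sketch of an LSTM forecaster in Keras. The window length of 24 timesteps, the 32 hidden units, and the random data are illustrative assumptions, not the configuration used in this chapter; section 15.3 builds the real single-step, multi-step, and multi-output models on our dataset.

```python
# Minimal sketch of a single-step LSTM forecaster in Keras.
# Assumptions (not from the book): windows of 24 timesteps, 1 feature,
# 32 LSTM units, and random data standing in for the real dataset.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(32, input_shape=(24, 1)),  # read a 24-step window, keep the final hidden state
    Dense(1)                        # predict the next value in the series
])
model.compile(loss="mse", optimizer="adam", metrics=["mae"])

# Dummy data only to show the expected shapes: 100 windows of 24 timesteps.
X = np.random.rand(100, 24, 1)
y = np.random.rand(100, 1)
model.fit(X, y, epochs=2, verbose=0)

print(model.predict(X[:1]).shape)  # (1, 1): one forecast per input window
```

The key difference from the deep neural network of the last chapter is the input shape: the LSTM consumes each window as an ordered sequence of timesteps rather than a flat vector, which is exactly why it suits data whose order cannot be changed.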

15.1 Exploring the recurrent neural network (RNN)

15.2 Examining the LSTM architecture

15.2.1 The forget gate

15.2.2 The input gate

15.2.3 The output gate

15.3 Implementing the LSTM architecture

15.3.1 Implementing an LSTM as a single-step model

15.3.2 Implementing an LSTM as a multi-step model

15.3.3 Implementing an LSTM as a multi-output model

15.4 Next steps

15.5 Exercises

Summary