17 Using predictions to make more predictions
This chapter covers
- Examining the autoregressive LSTM architecture
- Discovering the caveat of the autoregressive LSTM
- Implementing an autoregressive LSTM
In the last chapter, we examined and built a convolutional neural network (CNN). We even combined it with the LSTM architecture to test whether we could outperform the LSTM models. The results were mixed: the CNN performed worse as a single-step model, best as a multi-step model, and equally well as a multi-output model.
Now we'll focus entirely on the multi-step models, since all of them output the entire sequence of predictions in a single shot. Instead, we could gradually output the prediction sequence, using past predictions to make new ones. That way, the model performs rolling forecasts, but it uses its own predictions to inform the next prediction.
This architecture is commonly used with the LSTM and is called the autoregressive LSTM, or ARLSTM. In this chapter, we'll first explore the general architecture of the ARLSTM model and then build it in Keras to see if we can create a new top-performing multi-step model.
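To make the feedback idea concrete before we build the full ARLSTM, here is a minimal sketch of a rolling-forecast loop around a one-step-ahead model. This is not the ARLSTM implementation we'll build later in the chapter; the `autoregressive_forecast` helper, the single-feature setup, and the window handling are assumptions made purely for illustration.

```python
import numpy as np

def autoregressive_forecast(model, history, steps=24):
    """Roll a one-step-ahead model forward, feeding each prediction
    back into the input window (hypothetical single-feature setup).

    model   -- any fitted Keras model exposing .predict() on inputs
               of shape (batch, time, features)
    history -- 1-D array holding the most recent observed values
    steps   -- number of future steps to generate
    """
    window = list(history)                     # working copy of the input window
    preds = []
    for _ in range(steps):
        # Keep the window at a fixed length by taking the latest values
        x = np.array(window[-len(history):], dtype=np.float32)
        x = x.reshape(1, -1, 1)                # (batch, time, features)
        yhat = float(model.predict(x, verbose=0)[0, -1])
        preds.append(yhat)                     # store the new prediction...
        window.append(yhat)                    # ...and feed it back as input
    return np.array(preds)
```

Each pass through the loop appends the model's latest prediction to the input window, so later predictions are conditioned on earlier ones; that is the autoregressive behavior the ARLSTM bakes directly into its architecture.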
17.1 Examining the ARLSTM architecture
In previous chapters, we built many multi-step models, all of which output predictions for the traffic volume over the next 24 hours. Each model generated the entire prediction sequence in a single shot, meaning that we got all 24 values from the model right away.
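For contrast, a single-shot multi-step model can be sketched as below: a final dense layer with 24 units emits all 24 hourly predictions at once. The layer sizes and the 24-hour, single-feature input window are illustrative assumptions, not the exact models from earlier chapters.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

single_shot = Sequential([
    Input(shape=(24, 1)),   # 24 past hours, 1 feature (traffic volume)
    LSTM(32),               # summarize the input window into a hidden state
    Dense(24),              # emit all 24 future values in a single shot
])
single_shot.compile(loss='mse', optimizer='adam')
```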