chapter four

4 Deep Learning Accelerates

This chapter covers

The resurgence of recurrent neural networks (RNNs) after AlexNet
How Karpathy’s blog made RNNs accessible and inspired experimentation
How Chris Olah clarified LSTMs with vivid visuals and metaphors
How selective dropout enabled deeper recurrent networks
How Deep Speech 2 proved the real-world potential of RNNs
The engineering shift in artificial intelligence

Recurrent neural networks (RNNs) are sequence models. They carry context in a hidden state, so earlier inputs affect later predictions. This allows RNNs to model temporal dependencies and handle inputs of varying lengths. Despite their theoretical promise, practical issues such as vanishing or exploding gradients, poor handling of long-range dependencies, and inefficient training limit their effectiveness.
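To make the hidden-state idea concrete, here is a minimal sketch of a one-unit RNN in plain Python. It is an illustration under stated assumptions, not an implementation from any of the papers in this chapter: the function names, the scalar weights w_x and w_h, and their default values are all hypothetical. The first function shows how one loop handles sequences of any length while the state carries earlier inputs forward. The second applies the chain rule to show why gradients shrink or grow as they pass back through many time steps.

```python
import math


def rnn_forward(xs, w_x=0.5, w_h=0.9, h0=0.0):
    """Run a scalar RNN over a sequence and return every hidden state.

    w_x, w_h, and h0 are illustrative values, not taken from the text.
    """
    h = h0
    states = []
    for x in xs:
        # The new state mixes the current input with the previous state,
        # so earlier inputs keep influencing later steps.
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


def grad_last_wrt_first(xs, w_x=0.5, w_h=0.9, h0=0.0):
    """Return d h_T / d h_1 for the scalar RNN above.

    By the chain rule, each step t >= 2 contributes a factor
    w_h * (1 - h_t ** 2), the derivative of tanh times the recurrent weight.
    """
    states = rnn_forward(xs, w_x, w_h, h0)
    grad = 1.0
    for h in states[1:]:
        grad *= w_h * (1.0 - h * h)
    return grad


short_seq = [1.0, 0.0, 0.0]        # 3 time steps
long_seq = [1.0] + [0.0] * 49      # 50 time steps, same first input

# The same function handles both lengths with no change to the model.
short_states = rnn_forward(short_seq)
long_states = rnn_forward(long_seq)

# With |w_h| < 1 every factor is below 1, so the gradient vanishes
# as the sequence grows.
short_grad = grad_last_wrt_first(short_seq)
long_grad = grad_last_wrt_first(long_seq)

# With a large recurrent weight and states near zero, each factor is
# close to w_h, so the gradient explodes instead.
exploding_grad = grad_last_wrt_first([0.0] * 10, w_h=3.0)
```

In this sketch, the gradient for the 50-step sequence is far smaller than for the 3-step sequence, which is the vanishing-gradient problem in miniature. Setting w_h to 3.0 produces the opposite failure. Real RNNs use weight matrices rather than scalars, but the same repeated multiplication drives both effects.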

4.1 The Unreasonable Effectiveness of Recurrent Neural Networks

4.2 Understanding LSTM Networks

4.3 Recurrent Neural Network Regularization

4.4 Deep Speech 2

4.4.1 Core Architecture

4.4.2 Training Techniques

4.4.3 Language Models and Decoding

4.4.4 Significance and Broader Impact

4.4.5 An Engineering Shift