
4 Deep Learning Accelerates


This chapter covers

  • The resurgence of recurrent neural networks (RNNs) after AlexNet
  • How Karpathy’s blog post made RNNs accessible and inspired experimentation
  • How Chris Olah clarified LSTMs with vivid visuals and metaphors
  • How selective dropout enabled deeper recurrent networks
  • How Deep Speech 2 proved the real-world potential of RNNs
  • The engineering shift in artificial intelligence

Recurrent neural networks (RNNs) are sequence models. They carry context forward in a hidden state, so earlier inputs influence later predictions, which lets them model temporal dependencies and handle inputs of varying length. Despite this theoretical promise, practical problems such as vanishing and exploding gradients, difficulty capturing long-range dependencies, and slow, hard-to-parallelize training long limited their effectiveness.
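To make the hidden-state idea concrete, the following is a minimal sketch of a vanilla RNN step in NumPy. The weight names (W_xh, W_hh, b_h), dimensions, and random data are illustrative assumptions, not any particular library's API; the point is only that the same hidden state is updated at every step, so context flows forward through the sequence.

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # One vanilla RNN step: the new hidden state mixes the current input
    # with the previous hidden state, so earlier inputs keep influencing
    # later predictions.
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Illustrative sizes (assumptions): 8-dim inputs, 16-dim hidden state.
rng = np.random.default_rng(0)
input_dim, hidden_dim, seq_len = 8, 16, 5
W_xh = rng.normal(scale=0.1, size=(input_dim, hidden_dim))
W_hh = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                            # initial hidden state
for x_t in rng.normal(size=(seq_len, input_dim)):   # a sequence of any length
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)           # context carried forward

Because each step feeds the previous hidden state back in, the same small set of weights handles sequences of any length; the chapter's later sections show how LSTMs and careful regularization made this recurrence workable in practice.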

4.1 The Unreasonable Effectiveness of Recurrent Neural Networks

4.2 Understanding LSTM Networks

4.3 Recurrent Neural Network Regularization

4.4 Deep Speech 2

4.4.1 Core Architecture

4.4.2 Training Techniques

4.4.3 Language Models and Decoding

4.4.4 Significance and Broader Impact

4.4.5 An Engineering Shift