4 Deep Learning Accelerates


This chapter covers

  • The resurgence of recurrent neural networks (RNNs) after AlexNet
  • How Karpathy’s blog post made RNNs accessible and inspired experimentation
  • How Chris Olah clarified LSTMs with vivid visuals and metaphors
  • How applying dropout selectively enabled deeper recurrent networks
  • How Deep Speech 2 proved the real-world potential of RNNs
  • The engineering shift in artificial intelligence

Recurrent neural networks (RNNs) are purpose-built for sequential tasks: they maintain a hidden state that carries information forward from previous inputs, which lets them model temporal dependencies and handle inputs of varying lengths. Despite this theoretical promise, practical issues such as vanishing or exploding gradients, poor handling of long-range dependencies, and inefficient training often limit their effectiveness.
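To make the idea of a hidden state concrete, here is a minimal sketch of a vanilla RNN forward pass. It is illustrative only; the function name, weight names, and dimensions are assumptions for this sketch, not something defined in this chapter.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, W_hy, b_h, b_y):
    """Run a vanilla RNN over a sequence of input vectors.

    The hidden state h is updated at every step and carries information
    from earlier inputs, which is what lets the network model temporal
    dependencies and accept sequences of any length.
    """
    h = np.zeros(W_hh.shape[0])                 # initial hidden state
    outputs = []
    for x in inputs:                            # one step per sequence element
        # Backpropagating through this repeated W_hh multiplication is
        # where gradients tend to vanish or explode over long sequences.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)  # update hidden state
        outputs.append(W_hy @ h + b_y)          # per-step output
    return outputs, h

# Tiny usage example with random weights: five 3-dimensional inputs.
rng = np.random.default_rng(0)
x_dim, h_dim, y_dim = 3, 8, 2
seq = [rng.standard_normal(x_dim) for _ in range(5)]
outs, final_h = rnn_forward(
    seq,
    W_xh=rng.standard_normal((h_dim, x_dim)),
    W_hh=rng.standard_normal((h_dim, h_dim)),
    W_hy=rng.standard_normal((y_dim, h_dim)),
    b_h=np.zeros(h_dim),
    b_y=np.zeros(y_dim),
)
print(len(outs), final_h.shape)  # 5 (8,)
```

Because the same weights are reused at every step, the loop handles a sequence of any length; the trade-off, as the sections below discuss, is that information and gradients must survive many repeated applications of those weights.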

4.1 The Unreasonable Effectiveness of Recurrent Neural Networks

4.2 Understanding LSTM Networks

4.3 Recurrent Neural Network Regularization

4.4 Deep Speech 2

4.4.1 Core Architecture

4.4.2 Training Techniques

4.4.3 Architectural Enhancements

4.4.4 Language Models and Decoding

4.4.5 Significance and Broader Impact

4.4.6 An Engineering Shift