9 Improving retention with long short-term memory networks

 

This chapter covers

  • Adding deeper memory to recurrent neural nets
  • Gating information inside neural nets
  • Classifying and generating text
  • Modeling language patterns

For all the benefits recurrent neural nets provide for modeling relationships, and therefore possibly causal relationships, in sequence data, they suffer from one main deficiency: a token’s effect is almost completely lost by the time two more tokens have passed.[1] Any effect the first node has on the third node (two time steps after the first) will be thoroughly stepped on by new data introduced in the intervening time step. This behavior is inherent to the basic structure of the net, but it means the net misses a common feature of human language: tokens can be deeply interrelated even when they’re far apart in a sentence.
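To see this loss of memory concretely, here is a minimal NumPy sketch, not taken from the book's listings: the weight matrices, hidden size, and contractive scaling are all illustrative assumptions. It perturbs only the first token fed to a plain recurrent cell and measures how little of that change survives in the hidden state after each additional time step.

import numpy as np

np.random.seed(0)
hidden_size = 8

W_x = np.random.randn(hidden_size, hidden_size) * 0.5   # input weights (illustrative)
W_h = np.random.randn(hidden_size, hidden_size)
W_h *= 0.9 / np.linalg.norm(W_h, 2)                     # keep the recurrence contractive

def run(tokens):
    """Plain recurrent cell: h_t = tanh(W_h @ h_{t-1} + W_x @ x_t)."""
    h = np.zeros(hidden_size)
    for x in tokens:
        h = np.tanh(W_h @ h + W_x @ x)
    return h

tokens = [np.random.randn(hidden_size) for _ in range(6)]
nudged = [t.copy() for t in tokens]
nudged[0] += 1.0                                        # change only the first token

for t in range(1, len(tokens) + 1):
    gap = np.linalg.norm(run(nudged[:t]) - run(tokens[:t]))
    print(f"effect of token 1 on the hidden state after {t} step(s): {gap:.4f}")

With each step, the recurrent weights and the squashing nonlinearity shrink whatever trace of the first token remains. The gates of an LSTM, introduced in section 9.1, are designed to let the net keep that kind of information around for as long as it is useful.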

Take this example:

The young woman went to the movies with her friends.

The subject “woman” immediately precedes its main verb “went.”[2] You learned in previous chapters that both convolutional and recurrent nets would have no trouble learning that relationship.

2  “Went” is the predicate (main verb) in this sentence. Find additional English grammar terminology at https://www.butte.edu/departments/cas/tipsheets/grammar/sentence_structure.html.

But in a similar sentence:

The young woman, having found a free ticket on the ground, went to the movies.

the subject “woman” and its verb “went” are now separated by an entire clause. A basic recurrent net has largely forgotten the subject by the time it reaches the verb, and retaining that kind of long-range dependency is exactly what this chapter’s long short-term memory (LSTM) networks are designed to do.

9.1 LSTM

9.1.1 Backpropagation through time

9.1.2 Where does the rubber hit the road?

9.1.3 Dirty data

9.1.4 Back to the dirty data

9.1.5 Words are hard. Letters are easier.

9.1.6 My turn to chat

9.1.7 My turn to speak more clearly

9.1.8 Learned how to say, but not yet what
