10 Attention mechanisms


This chapter covers

  • Understanding attention mechanisms and when to use them
  • Adding context to an attention mechanism for context-sensitive results
  • Handling variable-length items with attention

Imagine having a conversation with a couple of friends at a busy coffee shop. Around you are other conversations, people placing orders, and people talking on their cell phones. Despite all this noise, you, with your sophisticated brain and ears, can pay attention only to what is important (your friends!) and selectively ignore everything around you that is not relevant. The important thing here is that your attention adapts to the situation. You tune out the background sounds and listen only to your friends, but only while nothing more important is happening. If a fire alarm goes off, you stop paying attention to your friends and focus on this new, important sound. Thus, attention is about adapting to the relative importance of inputs.
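This core idea can be sketched in a few lines of NumPy (a toy illustration, not the implementation this chapter will build): each input gets a score, the scores are normalized into weights with a softmax, and the output is a weighted sum that emphasizes the important inputs. The feature vectors and the scoring vector below are made-up values for illustration only.

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Three hypothetical "sounds" in the coffee shop, each a 4-dim feature vector
inputs = np.array([
    [0.1, 0.2, 0.0, 0.1],   # background chatter
    [0.9, 0.8, 0.7, 0.9],   # a friend speaking
    [0.2, 0.1, 0.3, 0.2],   # espresso machine
])

# One score per input; here, a simple dot product with a "what matters" vector
importance = np.array([1.0, 1.0, 1.0, 1.0])
scores = inputs @ importance       # shape (3,)

weights = softmax(scores)          # attention weights, nonnegative and summing to 1
context = weights @ inputs         # weighted sum: dominated by the important input

print(weights.round(3))
```

Because the friend's feature vector produces the largest score, its softmax weight dominates, and the weighted sum `context` is pulled toward that input. The chapter's later sections replace this fixed scoring vector with learned, context-sensitive scoring functions.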

10.1 Attention mechanisms learn relative input importance

10.1.1  Training our baseline model

10.1.2  Attention mechanism mechanics

10.1.3  Implementing a simple attention mechanism

10.2 Adding some context

10.2.1  Dot score

10.2.2  General score

10.2.3  Additive attention

10.2.4  Computing attention weights

10.3 Putting it all together: A complete attention mechanism with context

Exercises

Summary
