10 Attention Mechanisms


This chapter covers

  • What attention mechanisms are and when to use them
  • Adding context to an attention mechanism
  • How attention can handle variable-length inputs

In this chapter we will learn about another type of prior belief we can impose on our network: an approach called an attention mechanism. If you believe that some portions (i.e., features) of your input are more or less important depending on what other features are present, then you should consider an attention-based approach. Attention is also a fundamental component of many new architectures and approaches developed in the past three years. If you want state-of-the-art results in speech recognition, object detection, chatbots, or machine translation, you are probably going to be using an attention mechanism.
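Before diving into the details, the core idea can be sketched in a few lines. The snippet below is a minimal, illustrative NumPy version of attention (the function and variable names are my own, not from this chapter): each item in a variable-length set of feature vectors gets a score against a context vector, the scores are normalized into weights via softmax, and the items are combined into a single weighted summary.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def dot_score_attention(context, items):
    """Weight each item by its dot-product similarity to a context
    vector, then return the weighted sum of the items.

    context: shape (d,)   -- e.g., a query or hidden state
    items:   shape (T, d) -- T feature vectors; T may vary per input
    """
    scores = items @ context         # (T,) one relevance score per item
    weights = softmax(scores)        # attention weights, non-negative, sum to 1
    summary = weights @ items        # (d,) context-aware combination
    return summary, weights

# Usage: pool three 4-dimensional items into one summary vector
rng = np.random.default_rng(0)
items = rng.normal(size=(3, 4))
context = rng.normal(size=4)
summary, weights = dot_score_attention(context, items)
```

Because the weights always sum to 1 regardless of how many items there are, the same function works for inputs of any length, which is one of the properties this chapter explores.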

10.1 Attention Mechanisms

10.1.1 Attention Mechanism Mechanics

10.1.2 Implementing a Simple Attention Mechanism

10.2 Adding Some Context

10.2.1 Dot Score

10.2.2 General Score

10.2.3 Additive Attention

10.2.4 Computing Attention Weights

10.3 Putting it All Together

10.4 Exercises

10.5 Summary
