4 Graph attention networks
This chapter covers
- Understanding attention and how it’s applied to graph attention networks
- Knowing when to use GAT and GATv2 layers in PyTorch Geometric
- Using mini-batching via the `NeighborLoader` class
- Implementing and applying graph attention network layers to a spam detection problem
In this chapter, we extend our discussion of convolutional graph neural network (convolutional GNN) architectures by looking at a special variant of such models: the graph attention network (GAT). While these GNNs use convolution as introduced in the previous chapter, they extend the idea with an attention mechanism that highlights important nodes in the learning process [1, 2]. In contrast to the conventional convolutional GNN, which weights all neighboring nodes equally, the attention mechanism allows the GAT to learn which nodes to put extra emphasis on during training.
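To make this idea concrete before we dig into the details, here is a minimal sketch of PyTorch Geometric's `GATConv` layer applied to a tiny toy graph; the node features, edges, and layer sizes are arbitrary assumptions chosen for illustration. The point to notice is that, alongside the node embeddings, the layer can return the learned attention coefficients that weight each edge.

```python
import torch
from torch_geometric.nn import GATConv

# Toy graph for illustration: 4 nodes with 8 features each; edge_index
# holds source/target node pairs in the usual [2, num_edges] PyG layout.
x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 1, 2, 3],
                           [1, 0, 2, 3, 2]])

# A single GAT layer with 2 attention heads; head outputs are
# concatenated by default, so each node embedding has 16 * 2 = 32 dims.
conv = GATConv(in_channels=8, out_channels=16, heads=2)

# return_attention_weights=True also yields the learned per-edge
# attention coefficients (GATConv adds self-loops by default, so
# alpha covers those edges as well).
out, (attn_edge_index, alpha) = conv(x, edge_index,
                                     return_attention_weights=True)

print(out.shape)    # torch.Size([4, 32])
print(alpha.shape)  # torch.Size([9, 2]) -> 5 edges + 4 self-loops, 2 heads
```

Unlike a plain convolutional layer, where each neighbor's contribution is fixed by the graph structure, the `alpha` values above are trained end to end, so the model itself decides how much each neighbor matters.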
As with convolution, attention is a widely used mechanism in deep learning outside of GNNs. Architectures that rely on attention (particularly transformers) have seen such success in addressing natural language problems that they now dominate the field. It remains to be seen if attention will have a similar effect in the graph world.