4 Graph Attention Networks (GATs)
This chapter covers
- The notion of attention and how it is applied to GNNs
- The GAT and GATv2 layers in PyG and when to use them
- How to use mini-batching via the NeighborLoader class
- Implementing and applying GAT layers to a spam detection problem
In the last chapter, we examined convolutional GNNs, focusing on GCN and GraphSAGE. In this chapter, we extend that discussion by looking at a special variant of these models, the Graph Attention Network.
Figure 4.1 Mental model of the GNN training process, with the subject of GATs shown in context.

As with the previous chapter, our goal is to understand and apply Graph Attention Networks (GATs). While these GNNs build on the convolution operation introduced in the previous chapter, they extend it with an attention mechanism that highlights important nodes in the learning process. This is in contrast to conventional convolutional GNNs, which weight all neighboring nodes equally.
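To make that contrast concrete, the sketch below shows how a GAT layer slots into a PyG model in place of a GCN layer. This is a minimal, illustrative example, not the spam detection model built later in this chapter; the Planetoid Cora dataset is assumed here purely as a stand-in.

```python
# Minimal sketch: a two-layer GAT in PyG. Cora is used here only as an
# illustrative dataset (an assumption, not this chapter's spam data).
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GATConv

dataset = Planetoid(root="data/Planetoid", name="Cora")
data = dataset[0]

class GAT(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, heads=8):
        super().__init__()
        # Unlike GCNConv, GATConv learns a per-edge attention weight,
        # so neighbors contribute unequally to each node's update.
        # Each of the `heads` attention heads learns its own weighting;
        # their outputs are concatenated, giving hidden_dim * heads features.
        self.conv1 = GATConv(in_dim, hidden_dim, heads=heads)
        # Final layer uses a single head to produce class scores.
        self.conv2 = GATConv(hidden_dim * heads, out_dim, heads=1)

    def forward(self, x, edge_index):
        x = F.elu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

model = GAT(dataset.num_features, 8, dataset.num_classes)
out = model(data.x, data.edge_index)  # logits, shape [num_nodes, num_classes]
```

Note that swapping `GATConv` for `GCNConv` would leave the rest of the model unchanged; the difference is that the attention heads learn how much each neighbor matters, rather than applying fixed, degree-based weights.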
As with convolution, attention is a widely used mechanism in deep learning beyond GNNs. Architectures built on attention (particularly transformers) have been so successful on natural language problems that they now dominate the field. It remains to be seen whether attention will have a similar impact in the graph world.