4 Graph attention networks
This chapter covers
- Understanding attention and how it’s applied to graph attention networks
- Knowing when to use GAT and GATv2 layers in PyTorch Geometric
- Using mini-batching via the `NeighborLoader` class
- Implementing and applying graph attention network layers to a spam detection problem
In this chapter, we extend our discussion of convolutional graph neural network (convolutional GNN) architectures by looking at a special variant of such models: the graph attention network (GAT). While these GNNs use convolution as introduced in the previous chapter, they extend the idea with an attention mechanism that highlights important nodes in the learning process [1, 2]. In contrast to the conventional convolutional GNN, which weights all neighboring nodes equally, the attention mechanism allows the GAT to learn which nodes to put extra emphasis on during training.
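To make this idea concrete before we dig into the details, here is a minimal sketch of PyTorch Geometric's `GATConv` layer applied to a tiny toy graph; the node features, edges, and layer sizes are arbitrary assumptions chosen for illustration. The point to notice is that, alongside the node embeddings, the layer can return the learned attention coefficients that weight each edge.

```python
import torch
from torch_geometric.nn import GATConv

# Toy graph for illustration: 4 nodes with 8 features each; edge_index
# holds source/target node pairs in the usual [2, num_edges] PyG layout.
x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 1, 2, 3],
                           [1, 0, 2, 3, 2]])

# A single GAT layer with 2 attention heads; head outputs are
# concatenated by default, so each node embedding has 16 * 2 = 32 dims.
conv = GATConv(in_channels=8, out_channels=16, heads=2)

# return_attention_weights=True also yields the learned per-edge
# attention coefficients (GATConv adds self-loops by default, so
# alpha covers those edges as well).
out, (attn_edge_index, alpha) = conv(x, edge_index,
                                     return_attention_weights=True)

print(out.shape)    # torch.Size([4, 32])
print(alpha.shape)  # torch.Size([9, 2]) -> 5 edges + 4 self-loops, 2 heads
```

Unlike a plain convolutional layer, where each neighbor's contribution is fixed by the graph structure, the `alpha` values above are trained end to end, so the model itself decides how much each neighbor matters.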
As with convolution, attention is a widely used mechanism in deep learning outside of GNNs. Architectures that rely on attention (particularly transformers) have seen such success in addressing natural language problems that they now dominate the field. It remains to be seen if attention will have a similar effect in the graph world.