7 Learning and inference at scale


This chapter covers

  • Strategies for handling data overload in small systems
  • Recognizing graph neural network problems that require scaled resources
  • Seven robust techniques for mitigating problems arising from large data
  • Scaling graph neural networks and tackling scalability challenges with PyTorch Geometric

For most of our journey through graph neural networks (GNNs), we've explained key architectures and methods, but we've limited examples to problems of relatively small scale. Our reason for doing so was to keep the example code and data readily accessible.

However, real-world problems in deep learning are rarely so neatly packaged. One of the major challenges in real-world scenarios is training GNN models when the dataset is too large to fit in memory or so large that it overwhelms the processor [1].
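To make this concrete, a quick back-of-envelope calculation shows why even a modestly large graph can exhaust memory. The node, edge, and feature counts below are illustrative, loosely modeled on the scale of graphs we'll meet later in this chapter, not exact figures for any particular dataset:

```python
# Rough memory estimates for a large graph (illustrative numbers).
num_nodes = 2_400_000
num_edges = 120_000_000
num_features = 100
bytes_per_float = 4   # float32
bytes_per_int = 8     # int64, PyTorch's default index dtype

dense_adj_gb = num_nodes ** 2 * bytes_per_float / 1e9
edge_index_gb = 2 * num_edges * bytes_per_int / 1e9
features_gb = num_nodes * num_features * bytes_per_float / 1e9

print(f"Dense adjacency matrix: {dense_adj_gb:,.0f} GB")   # ~23,000 GB
print(f"Sparse edge list:       {edge_index_gb:,.1f} GB")  # ~1.9 GB
print(f"Node feature matrix:    {features_gb:,.1f} GB")    # ~1.0 GB
```

Even with a sparse edge list, the graph and its features consume gigabytes before we account for activations and gradients, and a dense adjacency matrix is out of the question entirely.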

As we explore the challenges of scalability, it's crucial to have a clear mental model of the GNN training process. Figure 7.1 revisits our familiar visualization of this process. At its core, training a GNN revolves around acquiring data from a source, processing that data to extract relevant node and edge features, and then using those features to train a model. As the data grows, each of these steps becomes increasingly resource-intensive, making the scaling strategies we'll explore in this chapter necessary.

Figure 7.1 Mental model for the GNN training process. We will focus on scaling our system for large data in this chapter.
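Before we turn to large graphs, it helps to fix this baseline in code. The sketch below is a minimal full-batch PyTorch Geometric training loop on the small Cora citation dataset; the two-layer GCN and the hyperparameters are illustrative choices, not recommendations. Every step (loading the data, reading node features and edges, computing the loss) touches the whole graph at once, which is exactly what stops working at scale:

```python
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv

# Small benchmark graph: full-batch training is feasible here.
dataset = Planetoid(root='data/Planetoid', name='Cora')
data = dataset[0]  # a single Data object holding the whole graph

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

model = GCN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for epoch in range(200):
    optimizer.zero_grad()
    # Note: every epoch processes *all* nodes and edges at once.
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()
```

Keep this loop in mind as we proceed: the techniques in this chapter are, in one way or another, strategies for breaking its all-at-once structure into pieces that fit the resources we have.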

7.1 Examples in this chapter

7.1.1 Amazon Products dataset

7.1.2 GeoGrid

7.2 Framing problems of scale

7.2.1 Root causes

7.2.2 Symptoms

7.2.3 Crucial metrics

7.3 Techniques for tackling problems of scale

7.3.1 Seven techniques

7.3.2 General steps

7.4 Choice of hardware configuration

7.4.1 Types of hardware choices

7.4.2 Choice of processor and memory size

7.5 Choice of data representation

7.6 Choice of GNN algorithm

7.6.1 Time and space complexity

7.9.1 Example