chapter two

2 Graph embeddings

This chapter covers

Exploring graph embeddings and their importance
Creating node embeddings using non-GNN and GNN methods
Comparing node embeddings on a semi-supervised problem
Taking a deeper dive into embedding methods

Graph embeddings are essential tools in graph-based machine learning. They transform the intricate structure of graphs (whether the entire graph, individual nodes, or edges) into a more manageable, lower-dimensional space. We do this to compress a complex dataset into a form that’s easier to work with while preserving its inherent patterns and relationships. Those patterns and relationships are the information to which we’ll apply a graph neural network (GNN) or other machine learning method.

Graphs, as we’ve learned, encapsulate relationships and interactions within networks, whether they’re social networks, biological networks, or any system where entities are interconnected. Embeddings capture these real-life relationships in a compact form, facilitating tasks such as visualization, clustering, or predictive modeling.

There are numerous strategies to derive these embeddings, each with its own approach and application. They range from classical graph algorithms that use the network’s topology, through linear algebra techniques that decompose matrices representing the graph, to more advanced methods such as GNNs [1]. GNNs stand out because they integrate the embedding step directly into the learning algorithm, so the embeddings can be learned together with the task they serve.
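To make the linear algebra route concrete, here is a minimal sketch of one such technique: a truncated singular value decomposition (SVD) of a graph’s adjacency matrix. The toy graph, the helper name svd_embeddings, and the choice of two dimensions are all assumptions made for illustration; this is not the method used later in the chapter, only one simple instance of decomposing a matrix that represents the graph.

```python
import numpy as np

# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
num_nodes = 6

# Build the symmetric adjacency matrix of the undirected graph.
adjacency = np.zeros((num_nodes, num_nodes))
for u, v in edges:
    adjacency[u, v] = 1.0
    adjacency[v, u] = 1.0


def svd_embeddings(matrix, dim):
    """Return one dim-dimensional vector per node from a truncated SVD.

    Illustrative helper (not from the chapter): keeps the `dim` largest
    singular directions and scales them by the square root of their
    singular values.
    """
    left_vectors, singular_values, _ = np.linalg.svd(matrix)
    return left_vectors[:, :dim] * np.sqrt(singular_values[:dim])


# Compress each node's 6-entry adjacency row into a 2-dimensional embedding.
embeddings = svd_embeddings(adjacency, dim=2)
print(embeddings.shape)  # prints (6, 2)
```

Each row of embeddings is a node’s position in the lower-dimensional space. Unlike a GNN, this approach uses only the graph’s connectivity and has no trainable parameters, which is why it serves here as a baseline illustration rather than a recommended method.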

2.1 Creating embeddings with Node2Vec

2.1.1 Loading data, setting parameters, and creating embeddings

2.1.2 Demystifying embeddings

2.1.3 Transforming and visualizing the embeddings

2.1.4 Beyond visualization: Applications and considerations of N2V embeddings

2.2 Creating embeddings with a GNN

2.2.1 Constructing the embeddings

2.2.2 GNN vs. N2V embeddings

2.3 Using node embeddings

2.3.1 Data preprocessing

2.3.2 Random forest classification

2.3.3 Embeddings in an end-to-end model

2.4 Under the hood

2.4.1 Representations and embeddings

2.4.2 Transductive and inductive methods

2.4.3 N2V: Random walks across graphs

2.4.4 Message passing as deep learning

Summary