3 Graph Embeddings
This chapter covers
- Understanding graph embeddings and their limitations
- Using transductive and inductive techniques to create node embeddings
- Using the example dataset, introduced in Chapter 2 and Appendix A, to create node embeddings
In the previous chapter, we described a social graph created by a recruiting firm. Nodes are job candidates, and edges represent relationships between them. From the raw data, we generated graph data in the form of edge lists and adjacency lists, and then loaded that data into a graph processing framework (NetworkX) and a GNN library (PyTorch Geometric). Each node carried the candidate’s ID, job type (accountant, engineer, etc.), and industry (banking, retail, tech, etc.).
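To make that recap concrete, here is a minimal sketch of loading such a graph into NetworkX and converting it to a PyTorch Geometric Data object. The file name candidate_edges.txt and the placeholder feature vector are illustrative assumptions, not part of the book’s dataset:

```python
import networkx as nx
from torch_geometric.utils import from_networkx

# Read the edge list: each line holds a pair of connected candidate IDs.
G = nx.read_edgelist("candidate_edges.txt", nodetype=int)

# Attach a placeholder feature vector to every node; in practice this would
# encode attributes such as job type and industry.
for node in G.nodes:
    G.nodes[node]["x"] = [0.0]

# Convert to a PyTorch Geometric Data object for use with a GNN.
data = from_networkx(G)
print(data)  # e.g., Data(x=[N, 1], edge_index=[2, 2E], num_nodes=N)
```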
Next, we’ll discuss how to create graph embeddings from this graph. Graph embeddings are low-dimensional representations that can be generated for entire graphs, subgraphs, nodes, and edges. They are central to graph-based learning and can be produced in many different ways, including with graph algorithms, linear algebra methods, and GNNs.
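As one illustration of a non-GNN approach, the sketch below computes a spectral embedding from the eigenvectors of the normalized graph Laplacian, a classic linear algebra method. It uses NetworkX’s built-in karate club graph as a stand-in rather than the recruiting dataset:

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()  # stand-in graph for illustration
L = nx.normalized_laplacian_matrix(G).toarray()

# Eigenvectors associated with the smallest nonzero eigenvalues give a
# low-dimensional representation of each node.
eigenvalues, eigenvectors = np.linalg.eigh(L)
embedding_dim = 2
node_embeddings = eigenvectors[:, 1:embedding_dim + 1]  # skip the trivial eigenvector

print(node_embeddings.shape)  # (34, 2): one 2-dimensional vector per node
```

Here the embedding is computed once, up front, and any downstream model would be trained on it in a separate step.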
GNNs are special because embedding is inherent to their architecture. Previously, when a machine learning application needed a graph embedding, the embedding and the model training were done as separate processes. With GNNs, embedding and model training happen simultaneously as the model is trained.
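The sketch below, a generic two-layer GCN in PyTorch Geometric rather than a model from this book, illustrates that point: the hidden activations of the first layer are node embeddings that are learned jointly with the classifier weights during training.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class SimpleGCN(torch.nn.Module):
    def __init__(self, num_features, embedding_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(num_features, embedding_dim)
        self.conv2 = GCNConv(embedding_dim, num_classes)

    def forward(self, x, edge_index):
        # The output of the first layer is a learned node embedding.
        embeddings = F.relu(self.conv1(x, edge_index))
        out = self.conv2(embeddings, edge_index)
        # One forward pass yields both predictions and embeddings.
        return out, embeddings
```

Training this model against a node classification loss updates the embedding layer and the classifier at the same time, which is exactly the simultaneity described above.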