3 Graph Embeddings
This chapter covers
- Understanding graph embeddings and their limitations
- Using transductive and inductive techniques to create node embeddings
- Using the example dataset, introduced in Chapter 2 and Appendix A, to create node embeddings
In the previous chapter, we described a social graph created by a recruiting firm. Nodes are job candidates, and edges represent relationships between them. From the raw data, we generated graph data in the form of edge lists and adjacency lists, and then loaded that data into a graph processing framework (NetworkX) and a GNN library (PyTorch Geometric). Each node carried the candidate’s ID, job type (accountant, engineer, etc.), and industry (banking, retail, tech, etc.).
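To make that recap concrete, here is a minimal sketch of loading such a graph into NetworkX and converting it to a PyTorch Geometric Data object. The file name candidate_edges.txt and the placeholder feature vector are illustrative assumptions, not part of the book’s dataset:

```python
import networkx as nx
from torch_geometric.utils import from_networkx

# Read the edge list: each line holds a pair of connected candidate IDs.
G = nx.read_edgelist("candidate_edges.txt", nodetype=int)

# Attach a placeholder feature vector to every node; in practice this would
# encode attributes such as job type and industry.
for node in G.nodes:
    G.nodes[node]["x"] = [0.0]

# Convert to a PyTorch Geometric Data object for use with a GNN.
data = from_networkx(G)
print(data)  # e.g., Data(x=[N, 1], edge_index=[2, 2E], num_nodes=N)
```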
Next, we’ll discuss how to create graph embeddings from this graph. Graph embeddings are low-dimensional representations that can be generated for entire graphs, subgraphs, nodes, and edges. They are central to graph-based learning and can be produced in many different ways, including with graph algorithms, linear algebra methods, and GNNs.
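As one illustration of a non-GNN approach, the sketch below computes a spectral embedding from the eigenvectors of the normalized graph Laplacian, a classic linear algebra method. It uses NetworkX’s built-in karate club graph as a stand-in rather than the recruiting dataset:

```python
import networkx as nx
import numpy as np

G = nx.karate_club_graph()  # stand-in graph for illustration
L = nx.normalized_laplacian_matrix(G).toarray()

# Eigenvectors associated with the smallest nonzero eigenvalues give a
# low-dimensional representation of each node.
eigenvalues, eigenvectors = np.linalg.eigh(L)
embedding_dim = 2
node_embeddings = eigenvectors[:, 1:embedding_dim + 1]  # skip the trivial eigenvector

print(node_embeddings.shape)  # (34, 2): one 2-dimensional vector per node
```

Here the embedding is computed once, up front, and any downstream model would be trained on it in a separate step.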
GNNs are special because embedding is inherent to their architecture. Previously, when a machine learning application needed a graph embedding, the embedding and the model training were done as separate processes. With GNNs, embedding and model training happen simultaneously as the model is trained.
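The sketch below, a generic two-layer GCN in PyTorch Geometric rather than a model from this book, illustrates that point: the hidden activations of the first layer are node embeddings that are learned jointly with the classifier weights during training.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class SimpleGCN(torch.nn.Module):
    def __init__(self, num_features, embedding_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(num_features, embedding_dim)
        self.conv2 = GCNConv(embedding_dim, num_classes)

    def forward(self, x, edge_index):
        # The output of the first layer is a learned node embedding.
        embeddings = F.relu(self.conv1(x, edge_index))
        out = self.conv2(embeddings, edge_index)
        # One forward pass yields both predictions and embeddings.
        return out, embeddings
```

Training this model against a node classification loss updates the embedding layer and the classifier at the same time, which is exactly the simultaneity described above.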