9 Node embeddings and classification


This chapter covers

  • Introducing node embedding models
  • Presenting the difference between transductive and inductive models
  • Examining the difference between structural roles and homophily-based embeddings
  • Introducing the node2vec algorithm
  • Using node2vec embeddings in a downstream machine learning task

In the previous chapter, you used a vector to represent each node in the network. The vectors were handcrafted based on the features you deemed essential. In this chapter, you will learn how to automatically generate node representation vectors with a node embedding model. Node embedding models belong to the broader category of dimensionality-reduction techniques: they compress information about a node and its position in the network into a fixed-length vector.

An example of feature engineering as dimensionality reduction is the body mass index (BMI), which is commonly used to define obesity. To characterize obesity precisely, you could look at a person’s height and weight and also measure their fat percentage, muscle content, and waist circumference. In that case, you would be dealing with five input features to predict obesity. Instead of requiring all five measurements before an assessment can be made, physicians came up with BMI, a single value computed from just height and weight.
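To make the idea concrete, here is a minimal Python sketch (the field names and measurement values are illustrative, not taken from any dataset in this book) that collapses two of the five raw features, height and weight, into a single engineered BMI value:

```python
# A minimal illustration of dimensionality reduction via feature engineering:
# five candidate measurements describing a person are summarized by a single
# engineered value, the body mass index (BMI = weight in kg / height in m squared).
# The example values below are made up purely for illustration.

person = {
    "height_m": 1.80,
    "weight_kg": 85.0,
    "fat_percentage": 22.0,
    "muscle_content": 40.0,
    "waist_circumference_cm": 92.0,
}

def bmi(height_m: float, weight_kg: float) -> float:
    """Combine two raw features into one engineered feature."""
    return weight_kg / height_m ** 2

print(f"BMI: {bmi(person['height_m'], person['weight_kg']):.1f}")
# A downstream model (or a doctor) can now reason about one summary value
# instead of all five raw measurements.
```

Node embedding models automate this kind of compression: instead of relying on a hand-designed formula, they learn a low-dimensional vector for each node directly from the structure of the graph.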

9.1 Node embedding models

9.1.1 Homophily vs. structural roles approach

9.1.2 Inductive vs. transductive embedding models

9.2 Node classification task

9.2.1 Defining a connection to a Neo4j database

9.2.2 Importing a Twitch dataset

9.3 The node2vec algorithm

9.3.1 The word2vec algorithm

9.3.2 Random walks

9.3.3 Calculating node2vec embeddings

9.3.4 Evaluating node embeddings

9.3.5 Training a classification model

9.3.6 Evaluating predictions