9 Node embeddings and classification


This chapter covers

  • Introducing node embedding models
  • Presenting the difference between transductive and inductive models
  • Examining the difference between structural roles and homophily-based embeddings
  • Introducing the node2vec algorithm
  • Using node2vec embeddings in a downstream machine learning task

In the previous chapter, you used a vector to represent each node in the network. The vectors were handcrafted based on the features you deemed essential. In this chapter, you will learn how to automatically generate node representation vectors with a node embedding model. Node embedding models belong to the broader category of dimensionality-reduction techniques: they compress information about a node and its position in the network into a fixed-length vector.

An example of feature engineering as dimensionality reduction is the body mass index (BMI), which is commonly used to define obesity. To characterize obesity precisely, you could look at a person’s height and weight and also measure their fat percentage, muscle content, and waist circumference. In that case, you would be dealing with five input features to predict obesity. Instead of requiring all five measurements before an assessment can be made, physicians came up with BMI, a single value computed from just height and weight.
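To make the idea concrete, here is a minimal Python sketch (the field names and measurement values are illustrative, not taken from any dataset in this book) that collapses two of the five raw features, height and weight, into a single engineered BMI value:

```python
# A minimal illustration of dimensionality reduction via feature engineering:
# five candidate measurements describing a person are summarized by a single
# engineered value, the body mass index (BMI = weight in kg / height in m squared).
# The example values below are made up purely for illustration.

person = {
    "height_m": 1.80,
    "weight_kg": 85.0,
    "fat_percentage": 22.0,
    "muscle_content": 40.0,
    "waist_circumference_cm": 92.0,
}

def bmi(height_m: float, weight_kg: float) -> float:
    """Combine two raw features into one engineered feature."""
    return weight_kg / height_m ** 2

print(f"BMI: {bmi(person['height_m'], person['weight_kg']):.1f}")
# A downstream model (or a doctor) can now reason about one summary value
# instead of all five raw measurements.
```

Node embedding models automate this kind of compression: instead of relying on a hand-designed formula, they learn a low-dimensional vector for each node directly from the structure of the graph.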

9.1 Node embedding models

9.1.1 Homophily vs. structural roles approach

9.1.2 Inductive vs. transductive embedding models

9.2 Node classification task

9.2.1 Defining a connection to a Neo4j database

9.2.2 Importing a Twitch dataset

9.3 The node2vec algorithm

9.3.1 The word2vec algorithm

9.3.2 Random walks

9.3.3 Calculating node2vec embeddings

9.3.4 Evaluating node embeddings

9.3.5 Training a classification model

9.3.6 Evaluating predictions