10 Link prediction

This chapter covers

Discussing the link prediction workflow
Introducing the link prediction dataset split
Constructing link prediction features based on node pairs
Training and evaluating a supervised link prediction classification model

Most real-world networks are dynamic and evolve over time. Take a friendship network: people meet new friends and lose touch with others, so the set of connections keeps changing. You might assume that new connections form at random; however, most real-world networks follow deep organizing principles that make some new links more likely than others. Research on link prediction focuses on identifying and understanding these mechanisms of network evolution and applying them to predict future links.

10.1 Link prediction workflow

10.2 Dataset split

10.2.1 Time-based split

10.2.2 Random split

10.2.3 Negative samples

10.3 Network feature engineering

10.3.1 Network distance

10.3.2 Preferential attachment

10.3.3 Common neighbors

10.3.4 Adamic-Adar index

10.3.5 Clustering coefficient of common neighbors

10.4 Link prediction classification model

10.4.1 Missing values

10.4.2 Training the model

10.4.3 Evaluating the model

10.5 Solutions to exercises

Summary