10 Link prediction

 

This chapter covers

  • Understanding the link prediction workflow
  • Splitting a dataset for link prediction
  • Constructing link prediction features based on node pairs
  • Training and evaluating a supervised link prediction classification model

Most real-world networks are dynamic and evolve over time. Take, for example, a friendship network: people meet new friends and drift apart from others. You might assume that new connections in such a network form at random. However, it turns out that most real-world networks follow strong organizing principles. Link prediction research focuses on identifying and understanding these network evolution mechanisms and applying them to predict future links.

Figure 10.1. Link prediction.

10.2 Dataset split

10.2.1 Time-based split

10.2.2 Random split

10.2.3 Negative samples

10.3 Network feature engineering

10.3.1 Network distance

10.3.2 Preferential attachment

10.3.3 Common neighbors

10.3.4 Adamic-Adar index

10.3.5 Clustering coefficient of common neighbors

10.4.1 Missing values

10.4.2 Training the model

10.4.3 Evaluating the model

10.5 Summary

10.6 References

10.7 Solutions to exercises