7 Inferring co-occurrence networks based on bipartite networks

 

This chapter covers

  • Extracting hashtags from tweets with the Cypher query language
  • Calculating the Jaccard similarity coefficient
  • Constructing and analyzing monopartite networks using the Jaccard similarity coefficient
  • Using the Label Propagation algorithm to evaluate the community structure of a network
  • Using PageRank to find the most important node within a community

In the previous chapter, you learned how to transform a custom graph pattern into direct relationships and use them as input to graph algorithms such as PageRank. In this chapter, you will focus on bipartite networks and how to project them into monopartite networks. First, here is a quick refresher on what a bipartite network is.

Figure 7.1. Bipartite network of tweets and hashtags.

A bipartite network contains two sets or types of nodes. For example, Figure 7.1 visualizes the bipartite network of tweets on the left and their hashtags on the right. As you can observe, every relationship points from a node of one type to a node of the other type. There are no direct connections between two tweets or between two hashtags.
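To make the idea concrete before turning to Cypher, here is a minimal Python sketch of a bipartite tweet-hashtag network and one possible projection of it into a monopartite hashtag network. The tweets, hashtags, and the names `edges`, `tweets_by_hashtag`, and `co_occurrence` are made up for illustration and are not taken from the chapter's dataset. The sketch assumes that two hashtags are connected when they appear in at least one common tweet, weighted by the Jaccard similarity of the sets of tweets that use them.

```python
from itertools import combinations

# Hypothetical toy data: one (tweet, hashtag) pair per relationship.
edges = [
    ("tweet1", "#graphs"),
    ("tweet1", "#neo4j"),
    ("tweet2", "#graphs"),
    ("tweet2", "#neo4j"),
    ("tweet3", "#graphs"),
    ("tweet3", "#python"),
]

tweets = {tweet for tweet, _ in edges}
hashtags = {hashtag for _, hashtag in edges}

# Bipartite property: the two node sets do not overlap, and every
# relationship in `edges` goes from a tweet to a hashtag by construction.
is_bipartite = tweets.isdisjoint(hashtags)

# Collect the set of tweets that use each hashtag.
tweets_by_hashtag = {}
for tweet, hashtag in edges:
    tweets_by_hashtag.setdefault(hashtag, set()).add(tweet)


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    return len(a & b) / len(a | b)


# Monopartite projection: link two hashtags if they share at least one tweet.
co_occurrence = {}
for h1, h2 in combinations(sorted(tweets_by_hashtag), 2):
    score = jaccard(tweets_by_hashtag[h1], tweets_by_hashtag[h2])
    if score > 0:
        co_occurrence[(h1, h2)] = score

for pair, score in co_occurrence.items():
    print(pair, round(score, 2))
```

In this toy example, `#neo4j` and `#python` never appear in the same tweet, so the projection contains no relationship between them.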

7.1 Extracting hashtags from tweets

7.2 Constructing the co-occurrence network

7.2.1 Jaccard similarity coefficient

7.2.2 Node Similarity algorithm

7.3 Characterization of the co-occurrence network

7.3.1 Node degree centrality

7.3.2 Weakly connected components

7.4 Community detection with Label Propagation algorithm

7.5 Identifying community representatives with PageRank

7.5.1 Drop projected in-memory graphs

7.6 Summary

7.7 References

7.8 Solutions to exercises