Chapter 6. Other useful graph algorithms
This chapter covers
- Standard graph algorithms that GraphX doesn’t provide out of the box
- Shortest Paths on graphs with weighted edges
- The Traveling Salesman problem
- Minimum Spanning Trees
In chapter 5 you learned the foundational GraphX APIs that will enable you to write your own custom algorithms. But there’s no need for you to reinvent the wheel in those cases where the GraphX API already provides an implemented standard algorithm. There are some algorithms that have been historically associated with graphs for decades but are not in the GraphX API. This chapter describes some of those classic graph algorithms and discusses which situations they can be used in.
These classic graph algorithms were invented in the 1950s, long before Spark or any other sort of parallel computing. They are iterative in nature—for example, they add one edge at a time to the solution. GraphX’s Pregel API isn’t a good match because it operates on all the vertices simultaneously. The power of GraphX’s parallel processing is still being used, though, because each step in these algorithms involves some kind of graph-wide search. You’ll see how to use GraphX’s iterative Map/Reduce facilities (aggregateMessages() together with outerJoinVertices()) to implement and parallelize these algorithms that were originally designed for serial computation.