16 Gradient Descent
Optimization Problems (not just) on Graphs
This chapter covers
- Developing a randomized heuristic to find the minimum crossing number
- Introducing cost functions to show how the heuristic works
- Explaining gradient descent and implementing a generic version
- Discussing strengths and pitfalls of gradient descent
- Applying gradient descent to the graph embedding problem
If I mention a technique called gradient descent, does it ring a bell? You might not have heard of it, or maybe you recognize the name but can’t quite recall how it works. If so, that’s fine. If, however, I ask you about machine learning, classification problems, or neural networks, chances are that you know exactly what I’m talking about, and I bet these terms also spark your interest much more.
Well, you see, gradient descent (or a variation on the theme) is the optimization technique used behind the scenes by many machine learning algorithms, like the ones just mentioned. But did you know that, long before becoming the backbone of supervised learning, this technique was designed to solve optimization problems, like some of the ones we have seen on graphs?
If you didn’t know that, or if you’d simply like to delve into this topic and learn more about how gradient descent works, this is the perfect chapter for you.
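Before we begin, here is a taste of what’s to come: a minimal sketch of the core idea, not the generic implementation we develop later in the chapter. It minimizes the hypothetical one-dimensional function f(x) = (x - 3)^2 by repeatedly stepping against its gradient, f'(x) = 2(x - 3).

```python
def gradient_descent(gradient, start, learning_rate=0.1,
                     max_steps=100, tolerance=1e-6):
    """Repeatedly step against the gradient until the steps become tiny.

    A minimal preview sketch; the function and parameter names here are
    illustrative, not the chapter's actual implementation.
    """
    x = start
    for _ in range(max_steps):
        step = learning_rate * gradient(x)  # how far to move, and in which direction
        x -= step                           # move downhill, against the gradient
        if abs(step) < tolerance:           # stop once progress stalls
            break
    return x

# The minimum of f(x) = (x - 3)^2 is at x = 3; the sketch converges there.
print(gradient_descent(lambda x: 2 * (x - 3), start=0.0))  # ~3.0
```

Keep this loop in mind as you read on: everything in this chapter, from cost functions to the graph embedding application, is a refinement of this simple idea.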