
11 Finding boundaries with style: Support vector machines and the kernel method

 

This chapter covers

  • What a support vector machine is
  • What it means for a linear classifier to fit well between the points
  • A new linear classifier consisting of two parallel lines, and its error function
  • The tradeoff between good classification and a good fit: the C parameter
  • Using the kernel method to build non-linear classifiers
  • Types of kernels: the polynomial kernel and the radial basis function (rbf) kernel
  • Coding SVMs and the kernel method in sklearn

In this chapter I teach you support vector machines (SVMs), a very powerful classification algorithm. A support vector machine not only aims to find a boundary that separates the points; it aims to find the boundary that is as far as possible from the points. I also teach you the kernel method, a very useful technique for building classifiers with highly non-linear boundaries.

In Chapters 4 and 5, we learned about linear classifiers. In two dimensions, a linear classifier is defined by a line that separates a dataset of points with two labels. However, you may have noticed that many different lines can separate the same dataset, which raises the question: how do we know which line is best? In figure 11.1, I show three different classifiers that separate this dataset. Which one do you prefer: classifier 1, 2, or 3?

Figure 11.1. Three classifiers that classify our dataset correctly. Which one do we prefer: classifier 1, 2, or 3?
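To make the idea concrete, here is a minimal sketch (the data is illustrative, and I assume scikit-learn is installed) of fitting a linear SVM, which picks the separating line that is farthest from the points on both sides:

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative toy dataset: two well-separated clusters.
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],   # cluster with label 0
              [5.0, 5.0], [5.5, 6.0], [6.0, 5.5]])  # cluster with label 1
y = np.array([0, 0, 0, 1, 1, 1])

# kernel="linear" asks for a straight-line boundary; among all lines
# that separate the clusters, the SVM picks the one farthest from the
# points on both sides.
model = SVC(kernel="linear")
model.fit(X, y)

print(model.predict([[2.0, 2.0], [5.0, 5.0]]))  # one point near each cluster
```

A point near the first cluster is assigned label 0, and a point near the second is assigned label 1; later sections examine how the classifier arrives at this particular line.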

11.1  Using a new error function to build better classifiers

11.1.1    Classification error function - trying to classify the points correctly

11.1.2    Distance error function - trying to space our two lines as far apart as possible

11.1.3    Adding the two error functions to obtain the total error function

11.1.4    Using a dial to decide how we want our model: The C parameter

11.2  Coding support vector machines in sklearn

11.2.1    Coding a simple SVM

11.2.2    Introducing the C parameter

11.3  Going from lines to circles, parabolas, etc. - The kernel method

11.3.1    Using polynomial equations (circles, parabolas, hyperbolas, etc.) to our benefit - The polynomial kernel

11.3.2    Using bumps in higher dimensions to our benefit - The radial basis function (rbf) kernel

11.3.3    Training an SVM with the rbf kernel

11.3.4    Coding the kernel method
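As a preview of these last sections, here is a hedged sketch (with an illustrative, radially separated toy dataset; scikit-learn assumed, and gamma hand-picked for illustration) of an SVM with the rbf kernel, which can draw a non-linear boundary such as a circle around one class:

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative toy dataset: class 0 sits near the origin and class 1
# surrounds it, so no straight line can separate the two classes.
X = np.array([[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2], [0.1, -0.2],   # inner, label 0
              [3.0, 0.0], [0.0, 3.0], [-3.0, 0.0], [0.0, -3.0],
              [2.0, 2.0], [-2.0, -2.0]])                          # outer, label 1
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# kernel="rbf" places a "bump" on each support vector; gamma controls
# how wide the bumps are (an illustrative value here).
model = SVC(kernel="rbf", gamma=1.0)
model.fit(X, y)

# A point near the origin falls inside the circular boundary (label 0);
# a far-away point falls outside it (label 1).
print(model.predict([[0.0, 0.1], [3.0, 3.0]]))
```

A linear kernel could not separate this data, which is exactly the situation the kernel method is designed for.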