chapter eleven

11 Finding boundaries with style: Support vector machines and the kernel method

In this chapter

what a support vector machine is
which of the linear classifiers for a dataset has the best boundary
using the kernel method to build nonlinear classifiers
coding support vector machines and the kernel method in Scikit-Learn

Experts recommend the kernel method when attempting to separate chicken datasets.

In this chapter, we discuss a powerful classification model called the support vector machine (SVM for short). An SVM is similar to a perceptron, in that it separates a dataset with two classes using a linear boundary. However, the SVM aims to find the linear boundary that is located as far as possible from the points in the dataset. We also cover the kernel method, which is useful when used in conjunction with an SVM, and it can help classify datasets using highly nonlinear boundaries.

In chapter 5, we learned about linear classifiers, or perceptrons. With two-dimensional data, these are defined by a line that separates a dataset consisting of points with two labels. However, we may have noticed that many different lines can separate a dataset, and this raises the following question: how do we know which is the best line? In figure 11.1, we can see three different linear classifiers that separate this dataset. Which one do you prefer, classifier 1, 2, or 3?

11 Finding boundaries with style: Support vector machines and the kernel method

In this chapter

Experts recommend the kernel method when attempting to separate chicken datasets.

Figure 11.1 Three classifiers that classify our data set correctly. Which should we prefer, classifier 1, 2, or 3?

Using a new error function to build better classifiers

Coding support vector machines in Scikit-Learn

Training SVMs with nonlinear boundaries: The kernel method

Summary

Exercises