6 Classifying based on probabilities and hyperplanes: naive Bayes and support vector machines

 

This chapter covers:

  • What is the naive Bayes algorithm?
  • What is the support vector machine algorithm?
  • Building a naive Bayes model to predict political party based on votes
  • Building a support vector machine model to classify spam emails
  • Tuning many hyperparameters simultaneously with a random search

The naive Bayes and support vector machine (SVM) algorithms are supervised learning algorithms for classification. Each algorithm learns in a different way. The naive Bayes algorithm uses Bayes' rule, which you learned about in chapter 5, to estimate the probability of new data belonging to each of the classes in the dataset. The case is then assigned to the class with the highest probability. The SVM algorithm looks for a hyperplane (a surface that has one fewer dimension than there are predictor variables) that separates the classes. The position and direction of this hyperplane depend on support vectors: the cases that lie closest to the boundary between the classes.
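
To make the naive Bayes idea concrete, here is a minimal sketch in Python of how Bayes' rule can turn class priors and per-predictor likelihoods into class probabilities, with the case assigned to the most probable class. The function name, the vote names, and all of the probabilities are hypothetical and chosen only for illustration; they are not taken from the dataset used later in the chapter.

```python
from math import prod


def naive_bayes_posteriors(priors, likelihoods, case):
    """Return P(class | case) for each class using Bayes' rule.

    priors:      {class: P(class)}
    likelihoods: {class: {predictor: {value: P(value | class)}}}
    case:        {predictor: observed value}
    """
    # Numerator of Bayes' rule: the prior times the product of the
    # per-predictor likelihoods. Multiplying them together is the
    # "naive" assumption that predictors are independent given the class.
    unnormalized = {
        cls: priors[cls] * prod(likelihoods[cls][p][v] for p, v in case.items())
        for cls in priors
    }
    # Denominator (the evidence): the sum of the numerators over all classes.
    evidence = sum(unnormalized.values())
    return {cls: score / evidence for cls, score in unnormalized.items()}


# Hypothetical priors and likelihoods, for illustration only.
priors = {"democrat": 0.6, "republican": 0.4}
likelihoods = {
    "democrat": {"vote_a": {"y": 0.9, "n": 0.1}, "vote_b": {"y": 0.2, "n": 0.8}},
    "republican": {"vote_a": {"y": 0.3, "n": 0.7}, "vote_b": {"y": 0.7, "n": 0.3}},
}

new_case = {"vote_a": "y", "vote_b": "n"}
posteriors = naive_bayes_posteriors(priors, likelihoods, new_case)

# Assign the case to the class with the highest posterior probability.
predicted = max(posteriors, key=posteriors.get)
print(posteriors, predicted)
```

In practice the priors and likelihoods would be estimated from training data rather than written by hand; the sketch only shows the classification step.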

Note

While commonly used for classification, the SVM algorithm can also be used for regression problems. I won’t discuss how here, but if you’re interested (and want to explore SVMs in more depth generally), see the book Support Vector Machines, by Ingo Steinwart and Andreas Christmann (Springer, 2008).

6.1  What is the naive Bayes algorithm?

6.1.1  Using naive Bayes for classification

6.1.2  How is the likelihood calculated for categorical and continuous predictors?

6.2  Building our first naive Bayes model

6.2.1  Loading and exploring the HouseVotes84 dataset

6.2.2  Plotting the data

6.2.3  Training the model

6.3  Strengths and weaknesses of naive Bayes

6.4  What is the support vector machine (SVM) algorithm?

6.4.1  SVMs for linearly separable data

6.4.2  What if the classes aren’t fully separable?

6.4.3  SVMs for non-linearly separable data

6.4.4  Hyperparameters of the SVM algorithm

6.4.5  What if I have more than two classes?

6.5  Building our first SVM model

6.5.1  Loading and exploring the spam dataset