This chapter covers:
- What is the naive Bayes algorithm?
- What is the support vector machine algorithm?
- Building a naive Bayes model to predict political party based on votes
- Building a support vector machine model to classify spam emails
- Tuning many hyperparameters simultaneously with a random search
The naive Bayes and support vector machine (SVM) algorithms are supervised learning algorithms for classification, and each learns in a different way. The naive Bayes algorithm uses Bayes' rule, which you learned about in chapter 5, to estimate the probability that a new case belongs to each of the classes in the dataset; the case is then assigned to the class with the highest probability. The SVM algorithm looks for a hyperplane (a surface that has one fewer dimension than there are predictor variables) that separates the classes. The position and orientation of this hyperplane depend on the support vectors: the cases that lie closest to the boundary between the classes.
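To make the contrast concrete, here is a minimal sketch (not code from this book) that fits both algorithms to the same toy two-class dataset. It assumes Python and scikit-learn; the `GaussianNB` and `SVC` estimators stand in for whatever implementations you prefer.

```python
# Illustrative sketch: naive Bayes vs. SVM on the same toy 2-D data.
# Assumes scikit-learn is installed; this is not the book's own code.
from sklearn.datasets import make_blobs
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Two well-separated clusters, one per class.
X, y = make_blobs(n_samples=100, centers=2, random_state=42)

# Naive Bayes: estimates the probability of each class for a case
# via Bayes' rule, then assigns the case to the most probable class.
nb = GaussianNB().fit(X, y)

# SVM: finds a separating hyperplane whose position and orientation
# are determined by the support vectors (the cases nearest the boundary).
svm = SVC(kernel="linear").fit(X, y)

print(nb.predict(X[:5]))           # labels chosen by highest probability
print(svm.support_vectors_.shape)  # the cases that define the hyperplane
```

Both models draw a decision boundary between the classes, but they arrive at it differently: naive Bayes through class probabilities, the SVM through the geometry of the cases closest to the boundary.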
Note
While commonly used for classification, the SVM algorithm can also be used for regression problems. I won’t discuss how here, but if you’re interested (and want to explore SVMs in more depth generally), see the book Support Vector Machines, by Ingo Steinwart and Andreas Christmann (Springer, 2008).