9 Supervised machine learning with Random Forest and XGBoost
This chapter covers
- Introducing supervised machine learning (ML) and how it relates to threat hunting
- Applying supervised ML for threat hunting
- The importance of training data sets in supervised ML
- Acquiring and processing reliable training data sets
- Practicing threat hunting with supervised ML
- Evaluating and comparing supervised ML models
- Comparing supervised and unsupervised ML
Chapter 8 introduced unsupervised ML and used a k-means clustering model to group similar data points. Investigating the events mapped to the small clusters led us to uncover malicious activities. In this chapter, we introduce supervised ML and compare it with unsupervised ML in the context of threat hunting. We also identify the prerequisites for operating supervised ML effectively, some of which translate into operational challenges that threat hunters should be aware of.