chapter three
3 Machine learning for classification
This chapter covers
- Doing exploratory data analysis for identifying important features
- Encoding categorical variables to use them in machine learning models
- Using logistic regression for classification
In this chapter, we are going to use machine learning to predict churn.
Churn is a process in which customers stop using the services of a company. Thus, churn prediction is about identifying customers who are likely to cancel their contracts soon. If the company can do that, it can offer discounts on these services and this way keep the users.
Naturally, we can use machine learning for that: we can use the past data about customers who churned and, based on that, create a model for identifying present customers who are about to go away. This is a binary classification problem. The target variable that we want to predict is categorical and has only two possible outcomes: churn or not churn.