Churn is when customers stop using the services of a company. Thus, churn prediction is about identifying customers who are likely to cancel their contracts soon. If the company can do that, it can offer discounts on these services in an effort to keep the users.
Naturally, we can use machine learning for that: we can use past data about customers who churned and, based on that, create a model for identifying present customers who are about to leave. This is a binary classification problem. The target variable that we want to predict is categorical and has only two possible outcomes: churn or not churn.
In chapter 1, we learned that many supervised machine learning models exist, and we specifically mentioned ones that can be used for binary classification, including logistic regression, decision trees, and neural networks. In this chapter, we start with the simplest one: logistic regression. Even though it’s indeed the simplest, it’s still powerful and has many advantages over other models: it’s fast and easy to understand, and its results are easy to interpret. It’s a workhorse of machine learning and the most widely used model in the industry.