6 Credit card fraud detection using logistic regression

This chapter covers

Understanding credit card fraud with data
Building a logistic regression-based fraud detection model
Interpreting a trained logistic regression model
Deploying an ML model as a service

By 2022, the number of Americans who had fallen victim to credit card fraud had risen to over 151 million, approximately 65% of all American cardholders, up from 127 million in 2021 (https://www.security.org/digital-safety/credit-card-fraud-report/). These figures show both how prevalent credit card fraud already is and how steadily it is growing. The median fraudulent charge also increased, from USD 62 in 2021 to USD 79 in 2022. As cards are increasingly used for payments and card payment solutions are integrated into all kinds of businesses, robust credit card fraud detection systems have become a necessity.

Figure 6.1 Different types of credit card fraud.

Credit card fraud refers to any unauthorized use of a credit or debit card for fraudulent purposes, and it takes many shapes and forms. Figure 6.1 shows some of its broad types.

6.1 Understanding credit card fraud with data

6.1.1 Loading and cleaning the credit card fraud dataset

6.1.2 Data analysis and feature extraction

6.2 Building credit card fraud detection model

6.2.1 Splitting dataset into train and test sets

6.2.2 Addressing class imbalance

6.2.3 Feature scaling

6.2.4 Choosing the correct model performance metric

6.2.5 Training logistic regression model

6.2.6 Evaluating trained model on the test set

6.3 Interpreting trained logistic regression model

6.4 Deploying fraud detection model as a service

6.4.1 Saving best model

6.4.2 Creating a Python script to run model inference

6.4.3 Building a Flask web app to serve model

6.5 Summary