2 Rule-based fraud detection: a phishing example

This chapter covers

  • Analyzing a fraud (phishing) dataset to develop rules
  • Building a rule-based phishing detection system
  • Evaluating the performance of the rule-based system
  • Creating an executable phishing detection Python program

Have you ever played Guess in 10? In this game, your opponent picks a card tied to a theme such as animals or countries. You then ask up to 10 yes/no questions, such as "Is this animal herbivorous?", and try to guess the animal or country on the card. Each question narrows down the possible options and improves the odds that your final guess is right. Now, replace the animals or countries with a financial transaction, where you have to guess whether it is fraudulent. What questions would you ask about the transaction? Or, in fraud terms, what rules would you create around the transaction to tell whether it’s fraudulent?

2.1 Analyzing phishing fraud data

2.1.1 Analyzing binary features

2.1.2 Analyzing numerical features

2.2 Build and evaluate a rule-based phishing detection system

2.2.1 Rules based on binary features

2.2.2 Rules built on numerical features

2.2.3 Hybrid rules

2.3 Develop a fraud detection executable program in Python

2.4 Summary