9 Proximity-based algorithms

 

This chapter covers

  • Using advanced algorithms to fight fraud based on anomaly detection
  • Using graphs for storing and analyzing the k-NN of transactions
  • Identifying transactions that are anomalous

Chapter 8 introduced fraud detection through two approaches, both of which rely on relationships that are explicit in the data. In the first case, each transaction connected the cardholder to the merchant where the card was used. In the second case, bank or credit card accounts were connected by the owner’s personal or access details (phone number, address, IP address, and so on). In most cases, however, such relationships are not explicit, and we need to do more work to infer or discover connections between data items before we can detect and combat fraud.

This chapter explores advanced algorithms for fighting fraud, borrowed from anomaly detection theory, that are capable of recognizing anomalous items in large transactional datasets in which the data points appear to be independent. As I touched on in chapter 8, anomaly detection is the branch of data mining concerned with discovering rare occurrences, or outliers, in datasets. When you’re analyzing large and complex datasets, determining what stands out in the data is often at least as important and interesting as learning about its general structure.
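To make the idea of an outlier concrete, here is a minimal sketch of one common proximity-based score: the distance from each point to its k-th nearest neighbor. It is an illustration only, not necessarily the approach developed in the rest of the chapter. The function name `knn_outlier_scores`, the two features (amount and hour of day), and the sample values are all assumptions made up for this example.

```python
import math


def knn_outlier_scores(points, k):
    """Score each point by the distance to its k-th nearest neighbor.

    A larger score means the point sits farther from its neighbors,
    which makes it a stronger outlier candidate.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order.
        dists = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(dists[k - 1])
    return scores


# Hypothetical transactions as (amount, hour of day); the values are invented.
transactions = [
    (20.0, 10.0),
    (22.0, 11.0),
    (19.0, 9.0),
    (21.0, 10.0),
    (500.0, 3.0),
]

scores = knn_outlier_scores(transactions, k=2)

# Index of the transaction with the highest score.
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

In this toy data the last transaction is far from the other four, so it gets by far the largest score. The brute-force loop compares every pair of points, which is fine for a handful of records but would likely be too slow for a large transactional dataset.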

9.1 Proximity-based algorithms: An introduction

 
 

9.2 Distance-based approach

 
 

9.2.1 Storing transactions as a graph

 
 
 
 

9.2.2 Creating the k-nearest neighbors graph

 
 
 

9.2.3 Identifying fraudulent transactions

 
 

9.2.4 Advantages of the graph approach

 
 
 

Summary

 
 

References

 
 
 
 