Chapter 5. Should you question an invoice sent by a supplier?

 

This chapter covers

  • What’s the real question you’re trying to answer?
  • A machine learning scenario without labeled training data
  • The difference between supervised and unsupervised machine learning
  • Taking a deep dive into anomaly detection
  • Using the Random Cut Forest algorithm

Brett works as a lawyer for a large bank. He is responsible for checking that the law firms hired by the bank bill the bank correctly. How tough can this be, you ask? Pretty tough is the answer. Last year, Brett’s bank used hundreds of different firms across thousands of different legal matters, and each invoice submitted by a firm contains dozens or hundreds of lines. Tracking this using spreadsheets is a nightmare.

In this chapter, you’ll use SageMaker and the Random Cut Forest algorithm to create a model that highlights the invoice lines that Brett should query with a law firm. Brett can then apply this process to every invoice to keep the lawyers working for his bank on their toes, saving the bank hundreds of thousands of dollars per year. Off we go!

5.1. What are you making decisions about?

 
 
 

5.2. The process flow

 

5.3. Preparing the dataset

 
 

5.4. What are anomalies?

 

5.5. Supervised vs. unsupervised machine learning

 

5.6. What is Random Cut Forest and how does it work?

 

5.7. Getting ready to build the model

 

5.8. Building the model

 
 

5.9. Deleting the endpoint and shutting down your notebook instance

 
 
 

5.10. Checking to make sure the endpoint is deleted

 
 

Summary

 
 
 