chapter six

6 Detection theory

Let’s continue from chapter 3, where you are the data scientist building the loan approval model for the (fictional) peer-to-peer lender ThriveGuild. As then, you are in the first stage of the machine learning lifecycle, working with the problem owner to specify the goals and indicators of the system. You have already clarified that safety is important and that it is composed of two parts: basic performance (minimizing aleatoric uncertainty) and reliability (minimizing epistemic uncertainty). Now you want to go into greater depth in the problem specification for the first part: basic performance. (Reliability comes in part 4 of the book.)

What are the different quantitative metrics you could use in translating the problem-specific goals (e.g., expected profit for the peer-to-peer lender) to machine learning quantities? Once you’ve reached the modeling stage of the lifecycle, how would you know you have a good model? Do you have any special considerations when producing a model for risk assessment rather than simply offering an approve/deny output?

6.1 Selecting decision function metrics

6.1.1 Quantifying the possible events

6.1.2 Summary performance metrics

6.1.3 Accounting for different operating points

6.2 The best that you can ever do

6.3 Risk assessment and calibration

6.4 Summary