3 Safety
Imagine that you are a data scientist at the (fictional) peer-to-peer lender ThriveGuild. You are in the problem specification phase of the machine learning lifecycle for a system that evaluates and approves borrowers. The problem owners, the diverse stakeholders, and you yourself all want this system to be trustworthy and to cause no harm to people. Everyone wants it to be safe. But what do harm and safety mean in the context of a machine learning system?
Safety can be defined in very domain-specific ways: safe toys do not have lead paint or small parts that pose choking hazards, safe neighborhoods have low rates of violent crime, and safe roads do not exceed a maximum curvature. But these domain-specific definitions are not particularly useful for defining safety in machine learning. Is there an even more basic definition of safety that could be extended to the machine learning context? In fact, there is! And it is based on the concepts of (1) harm, (2) aleatoric uncertainty and risk, and (3) epistemic uncertainty.[1]
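To make the distinction between the two kinds of uncertainty concrete before the formal treatment, consider a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of ThriveGuild's actual system: a single income feature, a linear repayment outcome with irreducible noise, and a small bootstrap ensemble. Disagreement among ensemble members stands in for epistemic uncertainty (lack of knowledge, reducible with more data), while the spread of outcomes for identical inputs stands in for aleatoric uncertainty (inherent randomness, not reducible with more data).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data-generating process (an illustrative assumption): an
# applicant's repayment amount depends linearly on income, plus
# irreducible noise from life events the feature cannot capture.
def true_outcome(income):
    return 0.3 * income + rng.normal(scale=5.0, size=np.shape(income))

n = 200
income = rng.uniform(20, 100, size=n)   # hypothetical feature
repaid = true_outcome(income)           # noisy label

# Epistemic uncertainty: disagreement among models fit to bootstrap
# resamples of the same data. It shrinks as n grows.
n_models = 50
x_query = 60.0
preds = np.empty(n_models)
for m in range(n_models):
    idx = rng.integers(0, n, size=n)    # bootstrap resample
    slope, intercept = np.polyfit(income[idx], repaid[idx], deg=1)
    preds[m] = slope * x_query + intercept
epistemic_std = preds.std()

# Aleatoric uncertainty: spread of outcomes for *identical* inputs.
# More data cannot reduce it; it is inherent to the process.
repeats = true_outcome(np.full(10_000, x_query))
aleatoric_std = repeats.std()

print(f"epistemic std at income={x_query}: {epistemic_std:.2f}")
print(f"aleatoric std at income={x_query}: {aleatoric_std:.2f}")
```

Rerunning the sketch with a larger n shrinks only the epistemic term; the aleatoric term stays near the noise scale no matter how much data is collected. That asymmetry is what makes the two uncertainties worth separating when specifying safety requirements.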
This chapter teaches you how to approach the problem specification phase of a trustworthy machine learning system from a safety perspective. Specifically, by defining safety as the minimization of both aleatoric and epistemic uncertainty associated with harms, you can collaborate with the problem owners to crisply specify safety requirements and objectives that you can then work toward in the later phases of the lifecycle.[2] The chapter covers: