chapter four

4 Testing what we assume to know: Neyman, Pearson, and the principles of hypothesis testing

This chapter covers

Jerzy Neyman and Egon Pearson’s On the Problem of the Most Efficient Tests of Statistical Hypotheses (1933) and their stepwise hypothesis testing procedure
Prevailing best practices in hypothesis testing—from pre-specified hypotheses to principled conclusions—grounded in the Neyman-Pearson framework
How hypothesis testing separates real signals from random noise to prevent false conclusions and misguided actions
The cost of being wrong—why false positives and false negatives demand deliberate trade-offs in science, business, and AI
How hypothesis testing anchors modern workflows in statistics, data science, and machine learning, from A/B testing to model evaluation

By the early 1930s, the foundations of modern inference were already taking shape. Bayes had shown how belief could be updated in light of new evidence, grounding inference in probability. Fisher had introduced powerful tools for estimation and likelihood, along with the idea of significance testing. Yet something essential was still missing. Neither framework fully resolved how uncertainty should translate into action: how to decide when evidence is strong enough, how to weigh different kinds of error, and how to design procedures that behave reliably when applied again and again.

4.1 The stepwise framework of hypothesis testing

4.1.1 Step 1: state the hypothesis

4.1.2 Step 2: choose a significance level

4.1.3 Step 3: select the test

4.1.4 Step 4: compute the test statistic

4.1.5 Step 5: define the critical region

4.1.6 Step 6: make a decision

4.1.7 Step 7: draw a conclusion

4.1.8 From framework to practice

4.2 Hypothesis testing in action

4.2.1 Example 1: medical diagnostics

4.2.2 Example 2: safety monitoring

4.2.3 Example 3: a small clinical trial

4.2.4 From examples to principles

4.3 Why it matters

4.3.1 Guarding against false discoveries

4.3.2 Turning uncertainty into strategy

4.3.3 Finding signal in the noise

4.4 Applications in statistics, data science, and AI

4.5 Summary