9 Crimes against data: Statistics done wrong
This chapter covers
- Common statistics mistakes and why they happen
- Cognitive and data biases and how to mitigate them
- Identifying and mitigating p-hacking, the practice of cherry-picking models and data
Data, statistics, analytics, and machine learning are easily misused. It is easy to approach data with the wrong objective and to exploit the flexibility that modeling offers. This is often not malicious, but rather a result of human nature. We want things to work, and we want to troubleshoot, but this can quickly slip into bending the abstract ideas of statistics to our will.
This chapter will show you how to identify and prevent common mistakes in statistics, including the hacking of p-values.
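To get a feel for why cherry-picking is tempting, here is a minimal sketch of the arithmetic behind it. It assumes every test is run on pure noise, that the tests are independent, and that the significance threshold is 0.05; the function name `chance_of_false_positive` is made up for this illustration.

```python
def chance_of_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_tests` independent tests
    on pure noise yields a p-value below `alpha`.

    Assumption: the tests are independent. Real analyses that reuse the
    same data rarely satisfy this exactly, so treat it as a rough guide.
    """
    # Each test avoids a false positive with probability (1 - alpha),
    # so all of them avoid one with probability (1 - alpha) ** num_tests.
    return 1 - (1 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        rate = chance_of_false_positive(k)
        print(f"{k:>3} tests: {rate:.1%} chance of at least one spurious result")
```

Under these assumptions, a single test has a 5% chance of a spurious "significant" result, but trying 20 different models or data slices and reporting only the best one pushes that chance to roughly 64%.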
The importance of reading past headlines
News headlines often omit critical context. A catchy headline may get attention, but the article beneath it often holds important details that, at best, qualify the headline and, at worst, contradict it. In 2024, I encountered a headline from the New York Post: “Celebrity-loved diet linked to higher risk of heart disease death.”¹