chapter seven

7 Handling failure with grace

 

This chapter covers

  • Identifying the main categories of failures
  • Avoiding common failure-handling antipatterns
  • Logging errors effectively
  • Designing failure-handling strategies

Before diving deeper into our rosette of beautiful code, we need to step back and address something that cuts across every one of its dimensions: failure.

Because let’s face it, the code will inevitably fail at some point. Whether it’s unexpected data, an overlooked bug, or an external system suddenly taking the day off, something will go wrong. And it could get worse, snowballing into a complete system failure. That’s life. But that doesn’t mean you can’t be prepared.

Handling failures gracefully is neither trivial nor something to treat as an afterthought. It’s about ensuring system robustness and providing meaningful clues to trace the root cause, while preserving the clarity of the code’s story and the sanity of your end users. Proper failure management must be an integral part of the design from the outset. In this chapter, we’ll explore how to design robust failure-handling strategies, avoid common pitfalls, and ensure your users never have to face mishandled errors again.

7.1 Categorizing failure

Not all failures are created equal. Some are part of the natural flow of a program, others stem from coding errors or infrastructure issues, and some are so critical that they can take the entire system down if not handled carefully.
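To make the four categories concrete before we look at each one, here is a minimal Python sketch of how they might be told apart in code. Everything in it is an assumption made for illustration: the `BusinessRuleViolation` class and the `categorize` function are hypothetical names, and the mapping from Python's built-in exceptions to categories is one plausible reading, not a rule.

```python
class BusinessRuleViolation(Exception):
    """Hypothetical: an expected outcome of normal program flow,
    such as a withdrawal that exceeds the account balance."""


def categorize(exc: BaseException) -> str:
    """Map an exception to one of the four categories (illustrative only)."""
    if isinstance(exc, BusinessRuleViolation):
        # Part of the natural flow of the program.
        return "business logic error"
    if isinstance(exc, (MemoryError, RecursionError, SystemError)):
        # Assumption: conditions that threaten the whole process.
        return "fatal error"
    if isinstance(exc, OSError):
        # OSError covers ConnectionError, TimeoutError, and file system
        # problems: the infrastructure misbehaved, not the code.
        return "technical issue"
    # Assumption: anything else (TypeError, KeyError, ...) points to a bug.
    return "programming error"


if __name__ == "__main__":
    for exc in (
        BusinessRuleViolation("insufficient funds"),
        ConnectionRefusedError("payment gateway is down"),
        TypeError("expected int, got str"),
        MemoryError(),
    ):
        print(f"{type(exc).__name__}: {categorize(exc)}")
```

The order of the checks matters in this sketch: the most specific and most severe cases are tested first, and the catch-all default is deliberately "programming error", on the assumption that an exception nobody anticipated is most likely a bug.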

7.1.1 Business logic errors

7.1.2 Technical issues

7.1.3 Programming errors

7.1.4 Fatal errors

7.2 Antipatterns in failure handling

7.2.1 Failing to defend the entry points

7.2.2 Lazy catch

7.2.3 Logging gone wrong

7.2.4 Flawed error handling

7.3 Summary