Chapter 8. Strategies for fault tolerance and monitoring

 

This chapter covers

  • What is latency?
  • Why do microservices need to be fault tolerant?
  • How do circuit breakers work?
  • What tools can mitigate against distributed failure?

You’ll use the example from the previous chapters to expand the functionality of Stripe and Payment to include fault mitigation as you explore the concepts of fault tolerance and monitoring. Fault tolerance is especially important when your Payment microservice is communicating over a network to external systems. You need to expect failures and time-outs when communicating across networks.

8.1. Microservice failures in a distributed architecture

Figure 8.1 revisits what your distributed architecture for microservices looks like.

Figure 8.1. Microservices in a distributed architecture

How is this distributed architecture relevant to failures? By virtue of your microservices containing smaller chunks of business logic, as opposed to a monolith that contains everything, you end up with a significantly larger number of services to maintain. You’re no longer dealing with a UI that might communicate with a single backend service that handles all its needs. More likely, that same UI is now integrating with dozens of microservices, or more, that need to be just as reliable as your previous monolith.

8.2. Network failures

 

8.3. Mitigating against failures

 
 
 

8.4. Adding Hystrix to your Payment microservice

 

Summary

 
 
 
 
sitemap

Unable to load book!

The book could not be loaded.

(try again in a couple of minutes)

manning.com homepage
test yourself with a liveTest