Part 2 Securing, observing, and controlling your service’s network traffic

 

A single misbehaving service has the potential to take down your entire system. We’ve seen it time and time again: maybe a thread pool fills up, a database slows down, or a rare bug triggers and causes a service to spin out of control. How do we build resilience into our services so that they expect and correctly deal with these scenarios? How do we consistently monitor golden signals to detect failure situations? How do we secure the communication between services?

Istio helps solve these challenges. Chapters 4-9 look at handling traffic from ingress to deep within a call graph. How do load-balancing algorithms coupled with resilience strategies help the overall system stay available even in the face of service failures? How do you consistently observe throughput, latency, saturation, and error rates for all of the services in your architecture? Can you trace specific service calls to help pinpoint issues in the network? Can you write policies about which services can communicate and, when they do, ensure that the peers on both sides of the connection can be certain they are communicating with whom they think they are? All of these topics are covered in this part of the book.