In the previous chapter, we learned about fault tolerance, retries, and idempotence of operations in the context of a relatively straightforward system architecture. In real life, our systems consist of multiple components responsible for different parts of our business domain and infrastructure. For example, one service may be responsible for collecting metrics, another for collecting logs, and so on. Besides that, we need applications that provide the primary business use cases of our domain, such as a payment service, as well as components such as a database that is responsible for persistence. In such architectures, services need to connect with each other to exchange information.
The more components our system has, the more points at which failure can occur. Every network request can fail, and for each failure we need to decide whether the action should be retried. If we want to create a fault-tolerant architecture, we need to build failure handling into the system. This means that every component needs to provide precise delivery semantics when producing data. Likewise, the consumption of data should follow the expected delivery semantics.
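To make the retry decision concrete, here is a minimal sketch of how that choice shapes what the receiving side observes. It assumes two commonly used delivery semantics, at-most-once (never retry) and at-least-once (retry until acknowledged). The `FlakyChannel` class, its outcome names, and both `send_*` helpers are hypothetical, introduced only for illustration; they are not part of any real library.

```python
from typing import Iterable, List


class FlakyChannel:
    """Hypothetical in-memory channel that fails according to a scripted list of outcomes."""

    def __init__(self, outcomes: Iterable[str]) -> None:
        # Each entry is "ok", "lost_request", or "lost_ack"; once exhausted, sends succeed.
        self._outcomes: List[str] = list(outcomes)
        self.delivered: List[str] = []  # what the consumer actually received

    def send(self, message: str) -> None:
        outcome = self._outcomes.pop(0) if self._outcomes else "ok"
        if outcome == "lost_request":
            # The request never reached the consumer.
            raise ConnectionError("request lost")
        self.delivered.append(message)
        if outcome == "lost_ack":
            # The consumer got the message, but the producer never learns about it.
            raise ConnectionError("acknowledgment lost")


def send_at_most_once(channel: FlakyChannel, message: str) -> bool:
    """Try once and never retry: the message may be lost, but is never duplicated."""
    try:
        channel.send(message)
        return True
    except ConnectionError:
        return False


def send_at_least_once(channel: FlakyChannel, message: str, max_attempts: int = 3) -> bool:
    """Retry until acknowledged (up to max_attempts): the message may be duplicated."""
    for _ in range(max_attempts):
        try:
            channel.send(message)
            return True
        except ConnectionError:
            continue  # the producer cannot tell which side of the exchange failed
    return False
```

From the producer's point of view, a lost request and a lost acknowledgment look identical. Without retries, a lost request means the consumer never sees the message. With retries, a lost acknowledgment means the consumer sees the message twice, which is why the consuming side needs to handle duplicates, for example by making the operation idempotent.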