11 Delivery semantics in distributed systems

 

This chapter covers

  • Publish-subscribe and producer-consumer models in data-intensive applications
  • Delivery guarantees and their impact on resilience and fault tolerance
  • Building fault-tolerant systems leveraging delivery semantics

In the previous chapter, we learned about fault tolerance, retries, and idempotence of operations in the context of a relatively straightforward system architecture. In real life, our systems consist of multiple components responsible for different parts of our business domain and infrastructure. For example, we may have a service that is responsible for collecting metrics. Another service may be responsible for collection logs and so on. Besides that, we need applications that provide the primary business use cases of our domain. This can be a payment service or a database that is responsible for persistence. In those architectures, services need to connect with each other to be able to exchange information.

The more components our system has, the more points where failure can occur. Every network request can fail, and we need to decide if an action should be retried or not. If we want to create a fault-tolerant architecture, we need to build handling failure into the system. Then, every component needs to provide precise delivery semantics when producing the data. On the other hand, consumption of data should also follow expected delivery semantics.

11.1 Architecture of event-driven applications

11.2 Producer and consumer applications based on Apache Kafka

11.2.1 Looking at the Kafka consumer side

11.2.2 Understanding the Kafka brokers setup

11.3 The producer logic

11.3.1 Choosing consistency vs. availability for the producer

11.4 Consumer code and different delivery semantics

11.4.1 Committing a consumer manually

11.4.2 Restarting from the earliest or latest offsets

11.4.3 (Effectively) exactly-once semantic

11.5 Leveraging delivery guarantees to provide fault tolerance

Summary