10 Consistency and atomicity in distributed systems

 

This chapter covers

  • Traffic flow between microservices deployed to N nodes and a distributed database
  • Applications that work correctly in a one-node scenario and evolving them to work properly on N nodes
  • Differences between atomicity and consistency in your application’s environment

If we want our application to scale and run in a distributed environment, we need to design our code for that. Having a consistent view of the system is important and relatively easy to achieve if our application is deployed to one node and uses a standard database that works in a primary-secondary architecture. In such a context, a database transaction guarantees the atomicity of our operations. However, in reality, our applications need to be scalable and elastic.

Depending on our traffic patterns, we want to be able to deploy our services to N nodes. Once we deploy the application to N nodes, we may notice scalability problems on the lower layer—the database. In such a case, there is often a need to migrate the data layer to a distributed database. By doing so, we are able to distribute the incoming traffic handling to N microservices, which, in turn, distribute the traffic to M database nodes. In such an environment, our code needs to be designed in an entirely different way. This chapter focuses on the decisions and changes we need to perform to make our application logic consistent and atomic in such a distributed environment.

10.1 At-least-once delivery of data sources

10.1.1 Traffic between one-node services

10.1.2 Retrying an application’s call

10.1.3 Producing data and idempotency

10.1.4 Understanding Command Query Responsibility Segregation (CQRS)

10.2 A naive implementation of a deduplication library

10.3 Common mistakes when implementing deduplication in distributed systems

10.3.1 One node context

10.3.2 Multiple nodes context

10.4 Making your logic atomic to prevent race conditions

Summary