5 Delivery semantics

 

In this chapter

  • introducing delivery semantics and their impact
  • at-most-once delivery semantic
  • at-least-once delivery semantic
  • exactly-once delivery semantic

There’s never enough time to do it right, but there’s always enough time to do it over.

—Jack Bergman

Computers are pretty good at performing accurate calculations. However, when computers work together in a distributed system, like many streaming systems, accuracy becomes a little bit more (I mean, a lot more) complicated. Sometimes, we may not want 100% accuracy because other more important requirements need to be met. “Why would we want wrong answers?” you might ask. This is a great question, and it is the one that we need to ask when designing a streaming system. In this chapter, we are going to discuss an important topic related to accuracy in streaming systems: delivery semantics.

The latency requirement of the fraud detection system

In the previous chapter, the team built a credit card fraud detection system that can make a decision within 20 milliseconds for each transaction and store the result in a database. Now, let’s ask an important question that applies when building any distributed system: what happens if a failure occurs?

Revisit the fraud detection job

About accuracy

Partial result

A new streaming job to monitor system usage

The new system usage job

The requirements of the new system usage job

New concepts: (The number of) times delivered and times processed

New concept: Delivery semantics

Choosing the right semantics

At-most-once

The fraud detection job

At-least-once