2 Kafka Cluster Data Architecture
This chapter covers
- Organizing related messages through topics
- Utilizing partitions for parallel data processing
- The composition of Kafka messages: keys, values, and headers
- Using replication to ensure availability and fault tolerance
- Working with compacted topics for persistent data storage
The meeting room buzzed with anticipation as Max Sellington, Rob Routine, and Eva Catalyst gathered around the table, laptops open and minds focused on the task at hand: designing a proof of concept for their Customer 360 project with Apache Kafka. The idea was to pull together customer information from various sources and present it in a unified view. Today, they were diving into how to handle all this data on the Kafka servers, with plans to tackle client applications later.
MAX (leaning forward): Alright team, let's dive into our proof of concept. Eva, as our data engineer, where do we begin when selecting topics for our Kafka setup?
EVA (nodding): Good question, Max. In Kafka, "topics" group related messages and serve as the destinations where data is sent and stored. We need to pinpoint the key events we want to capture; customer interactions, transactions, and website visits could each become their own topic.
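To make Eva's point concrete, here is a minimal sketch of creating one topic per event type with Kafka's Java AdminClient. The topic names, partition counts, replication factor, and the localhost:9092 bootstrap address are illustrative assumptions, not settings chosen by the team.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class Customer360Topics {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumes a broker reachable at localhost:9092; adjust for your cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // One topic per event type; names, partition counts, and the
            // replication factor here are only placeholders for the sketch.
            List<NewTopic> topics = List.of(
                new NewTopic("customer-interactions", 3, (short) 1),
                new NewTopic("transactions", 3, (short) 1),
                new NewTopic("website-visits", 3, (short) 1)
            );
            // Block until the cluster confirms the topics were created.
            admin.createTopics(topics).all().get();
        }
    }
}
```

In practice, many teams create topics with the kafka-topics.sh command-line tool or let an operator manage them; the point is simply that each distinct stream of related events gets its own named topic.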