3 Schema Registry
This chapter covers
- Why working with bytes requires serialization rules
- What a schema is and why you need to use one
- What Schema Registry is
- Ensuring compatibility with changes—schema evolution
- Understanding subject names
- Reusing schemas with references
In chapter 2, you learned about the heart of the Kafka streaming platform, the Kafka broker. In particular, you learned how the broker acts as the storage layer, appending incoming messages to a topic, which serves as an immutable, distributed log of events. A topic represents the directory containing the log file(s).
Because producers send messages over the network, the messages must first be serialized into a binary format, an array of bytes. The Kafka broker does not change the messages in any way; it stores them in the same format it received them in. The same holds when the broker responds to fetch requests from consumers: it retrieves the already serialized messages and sends them over the network.
By only working with messages as arrays of bytes, the broker is entirely agnostic to the data type the messages represent and utterly independent of the applications producing and consuming the messages and the programming languages those applications use. Decoupling the broker from the data format permits any client using the Kafka protocol to produce or consume messages.
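To make the byte-array round trip concrete, here is a minimal sketch using only the Java standard library. The `Purchase` record, the hand-rolled byte layout, and the `List<byte[]>` standing in for a topic's log are all hypothetical names chosen for illustration; this is not how Kafka's own serializers are implemented. It only shows that the storage layer never needs to know what the bytes mean, while the producer and consumer must agree on the layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class BytesRoundTrip {

    // Hypothetical domain object; not part of any Kafka API.
    record Purchase(String item, int quantity) {}

    // Producer side. Assumed layout for this sketch:
    // [4-byte length of item][UTF-8 bytes of item][4-byte quantity]
    static byte[] serialize(Purchase purchase) {
        byte[] item = purchase.item().getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + item.length + Integer.BYTES)
                .putInt(item.length)
                .put(item)
                .putInt(purchase.quantity())
                .array();
    }

    // Consumer side. Must read the fields in exactly the order they were written.
    static Purchase deserialize(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        byte[] item = new byte[buffer.getInt()];
        buffer.get(item);
        int quantity = buffer.getInt();
        return new Purchase(new String(item, StandardCharsets.UTF_8), quantity);
    }

    public static void main(String[] args) {
        // Stand-in for a topic's log: it stores opaque byte arrays and nothing else.
        List<byte[]> log = new ArrayList<>();

        // "Produce": serialize, then append the bytes unchanged.
        log.add(serialize(new Purchase("coffee", 2)));

        // "Consume": fetch the same bytes and turn them back into an object.
        Purchase consumed = deserialize(log.get(0));
        System.out.println(consumed);
    }
}
```

Note that the log never inspects the bytes it holds. If the producer changed the layout, for example by writing the quantity first, the consumer would misread the data without any complaint from the storage layer, which is why both sides need a shared set of serialization rules.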