chapter three

3 Schema Registry

This chapter covers

Why working with bytes requires serialization rules
What a schema is and why you need to use one
What Schema Registry is
Ensuring compatibility with changes—schema evolution
Understanding subject names
Reusing schemas with references

In chapter 2, you learned about the heart of the Kafka streaming platform, the Kafka broker. In particular, you learned that the broker is the storage layer: it appends incoming messages to a topic, which serves as an immutable, distributed log of events. A topic represents the directory containing the log file(s).

Because producers send messages over the network, the messages must first be serialized into a binary format, an array of bytes. The Kafka broker does not change the messages in any way; it stores them exactly as it receives them. The same holds when the broker responds to fetch requests from consumers: it retrieves the already serialized messages and sends them over the network.
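To make the idea of turning an object into an array of bytes concrete, here is a minimal sketch in Python using only the standard library. The `PageView` record and its byte layout (an 8-byte user ID, a 2-byte length, then the URL text) are hypothetical and chosen only for illustration; this is not a format that Kafka or any Kafka serializer uses.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: 8-byte signed user id, 2-byte URL length, then the URL bytes.
# ">" means big-endian with no padding, so the header is always 10 bytes.
_HEADER = struct.Struct(">qH")


@dataclass(frozen=True)
class PageView:
    user_id: int
    url: str


def serialize(view: PageView) -> bytes:
    """Turn a PageView into bytes following the layout above."""
    url_bytes = view.url.encode("utf-8")
    return _HEADER.pack(view.user_id, len(url_bytes)) + url_bytes


def deserialize(data: bytes) -> PageView:
    """Rebuild a PageView from bytes written by serialize()."""
    user_id, url_len = _HEADER.unpack_from(data, 0)
    start = _HEADER.size
    url = data[start:start + url_len].decode("utf-8")
    return PageView(user_id, url)
```

The round trip only works because both functions agree on the same layout. If the writer changed the field order or sizes without telling the reader, `deserialize` would silently produce wrong values or fail.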

By only working with messages as arrays of bytes, the broker is entirely agnostic to the data type the messages represent and utterly independent of the applications producing and consuming the messages and the programming languages those applications use. Decoupling the broker from the data format permits any client using the Kafka protocol to produce or consume messages.
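The following sketch illustrates what being agnostic to the data type means in practice. `ToyLog` is a hypothetical in-memory stand-in for a topic's log, not the Kafka API; it accepts and returns raw bytes and never inspects them, so two producers can write entirely different formats to it.

```python
import json


class ToyLog:
    """In-memory stand-in for a topic's append-only log (not the Kafka API)."""

    def __init__(self) -> None:
        self._records = []

    def append(self, payload: bytes) -> int:
        """Store the bytes unchanged and return the record's offset."""
        if not isinstance(payload, bytes):
            raise TypeError("the log only accepts bytes")
        self._records.append(payload)
        return len(self._records) - 1

    def fetch(self, offset: int) -> list[bytes]:
        """Return every stored payload from the given offset onward, as written."""
        return self._records[offset:]


log = ToyLog()

# One producer writes JSON, another writes comma-separated text.
json_offset = log.append(json.dumps({"user": "alice", "amount": 12}).encode("utf-8"))
text_offset = log.append("alice,12".encode("utf-8"))

# The log hands back the same bytes; each consumer must know how to decode them.
fetched = log.fetch(0)
as_json = json.loads(fetched[json_offset].decode("utf-8"))
as_csv = fetched[text_offset].decode("utf-8").split(",")
```

The log never needs to know which format a payload uses. The cost is that a consumer holding only the bytes has no way to tell which decoding applies unless producers and consumers agree on the format out of band.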

3.1 Objects

3.2 What is a schema, and why do you need one?

3.2.1 What is Schema Registry?

3.2.2 Getting Schema Registry

3.2.3 Architecture

3.2.4 Communication: Using Schema Registry’s REST API

3.2.5 Registering a schema

3.2.6 Plugins and serialization platform tools

3.2.7 Uploading a schema file

3.2.8 Generating code from schemas

3.2.9 End-to-end example

3.3 Subject name strategies

3.3.1 TopicNameStrategy

3.3.2 RecordNameStrategy

3.3.3 TopicRecordNameStrategy

3.4 Schema compatibility

3.4.1 Backward compatibility

3.4.2 Forward compatibility

3.4.3 Full compatibility