11 Integrating external systems with Kafka Connect

This chapter covers

Integrating Kafka with data sources and sinks
Configuring connectors and workers for optimal data flow
Exploring the REST API for managing Kafka Connect
Creating and modifying connectors
Using the JDBC (Java Database Connectivity) Source and Debezium connectors

In most cases, we don’t introduce Kafka in isolation from the other systems in our company; rather, we want to connect systems such as databases and messaging systems to Kafka. Most of these use cases are quite similar: we want to transfer data from predefined database tables to specific topics, or write data from certain topics into a file. Of course, we always have the option of writing our own producers and consumers to move data to or from Kafka.

However, doing so is time-consuming and often produces error-prone systems that are also difficult to scale. Even simple requirements, such as transferring data from one system to another, involve many special cases. An alternative is to use Kafka Connect.

11.1 What is Kafka Connect?

11.2 Kafka Connect cluster: Distributed mode

11.2.1 Configuring a Kafka Connect cluster

11.2.2 Creating a connector

11.2.3 Testing the connector

11.3 Scalability and fault tolerance of Kafka Connect

11.4 Worker configuration

11.5 The Kafka Connect REST API

11.5.1 Status of a Kafka Connect cluster

11.5.2 Creating, modifying, and deleting connectors

11.6 Connector configuration

11.6.1 General connector configuration

11.6.2 Error handling in Kafka Connect

11.7 Single message transformations

11.8 Kafka Connect example: JDBC Source Connector

11.8.1 Preparing the JDBC Source Connector

11.8.2 Configuring the JDBC Source Connector

11.8.3 Testing the JDBC Source Connector

11.9 Kafka Connect example: Change data capture connector