This chapter covers:
- Why you might want to use Kafka
- Common myths about big data and message systems
- Real-world use cases where Kafka helps power messaging, streaming, and IoT data processing
As companies face a world full of data produced from every angle, they often find that legacy systems may not be the best option moving forward. One of the foundational pieces of new data infrastructures that has taken over the IT landscape is Apache Kafka®. Kafka is changing the standards for data platforms, leading the move away from extract, transform, load (ETL) and batch workflows, in which work was often held and processed in bulk at pre-defined times of day, toward near-real-time data feeds [1].

Batch processing, once the standard workhorse of enterprise data processing, may not be something to turn back to after seeing the powerful feature set that Kafka provides. In fact, it might not be able to handle the growing snowball of data rolling toward enterprises of all sizes without a new approach. With so much data, systems can easily become overloaded, and legacy systems may face nightly processing windows that run into the next day. To keep up with this constant stream of information, some of it evolving as it flows, processing data as it happens is a way to keep a system's view of its state current.
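The contrast between the two styles can be made concrete with a small sketch. The following is plain Python with no Kafka dependency, and all names in it are illustrative rather than Kafka APIs: a batch job holds every record and produces results only after the whole window closes, while a streaming consumer updates its state the moment each event arrives.

```python
# Conceptual sketch only (no Kafka involved): contrasts batch processing,
# which waits to handle records in bulk, with stream processing, which
# reacts to each record as it is produced. All names are hypothetical.

events = [{"sensor": "s1", "temp": 21.5},
          {"sensor": "s2", "temp": 22.0},
          {"sensor": "s1", "temp": 23.1}]

def batch_process(all_events):
    # Legacy style: hold everything, then process in one scheduled run.
    # Nothing is visible until the entire batch completes.
    return [e["temp"] for e in all_events]

def stream_process(event_source):
    # Streaming style: update state immediately for each incoming event,
    # so the system's view is always near-current.
    latest = {}
    for event in event_source:
        latest[event["sensor"]] = event["temp"]  # apply the event now
        yield dict(latest)                       # snapshot after each event

# Batch results appear only once the whole window has been collected...
print(batch_process(events))

# ...while streaming yields an up-to-date state after every single event.
for snapshot in stream_process(events):
    print(snapshot)
```

The key difference is latency to insight: the batch result is unavailable until the final record of the window arrives, whereas the streaming snapshots are usable after the very first event.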