Chapter 5. Moving from local to remote topologies
This chapter covers
- The Storm cluster
- Fault tolerance within a Storm cluster
- Storm cluster installation
- Deploying and running topologies on a Storm cluster
- The Storm UI and the role it plays
Imagine the following scenario. You’re tasked with implementing a Storm topology for performing real-time analysis on events logged within your company’s system. As a conscientious developer, you’ve decided to use this book as a guideline for developing the topology. You’ve built it using the core Storm components covered in chapter 2. You’ve applied the topology design patterns you learned about in chapter 3 while determining what logic should go into each bolt, and you’ve followed the steps in chapter 4 to provide at-least-once processing for all tuples coming into your topology. You’re ready to hook the topology up to a queue receiving logging events and have it hum along. What do you do next?
You can run your topology locally as in chapters 2, 3, and 4, but doing so won’t scale to the data volume and velocity you’re expecting. You need to deploy your topology to an environment built for handling production-level data. This is where the “remote” (also known as “production”) Storm cluster comes into play.