Chapter 5. Moving from local to remote topologies
This chapter covers
- The Storm cluster
- Fault tolerance within a Storm cluster
- Storm cluster installation
- Deploying and running topologies on a Storm cluster
- The Storm UI and the role it plays
Imagine the following scenario. You’re tasked with implementing a Storm topology for performing real-time analysis on events logged within your company’s system. As a conscientious developer, you’ve decided to use this book as a guideline for developing the topology. You’ve built it using the core Storm components covered in chapter 2. You’ve applied the topology design patterns you learned about in chapter 3 while determining what logic should go into each bolt, and you’ve followed the steps in chapter 4 to provide at-least-once processing for all tuples coming into your topology. You’re ready to hook the topology up to a queue receiving logging events and have it hum along. What do you do next?
You can run your topology locally as in chapters 2, 3, and 4, but doing so won’t scale to the data volume and velocity you’re expecting. You need to deploy your topology to an environment built for handling production-level data. This is where the “remote” (also known as “production”) Storm cluster comes into play.