Chapter 6. Tuning in Storm

 

This chapter covers

  • Tuning a Storm topology
  • Handling latency in a Storm topology
  • Using Storm’s built-in metrics-collecting API

So far, we’ve given you as gentle an introduction to Storm concepts as we can. It’s time to kick things up a notch. In this chapter, we’ll discuss life as a developer after you’ve deployed your topology to a Storm cluster. You thought your job was over once the topology was deployed, didn’t you? Think again! Once you deploy your topology, you need to make sure it’s running as efficiently as possible. That’s why we’ve devoted two chapters to tuning and troubleshooting.

We’ll briefly revisit the Storm UI, because it will be the most important tool you use to determine whether your topology is running efficiently. Then we’ll outline a repeatable process for identifying and resolving bottlenecks. Our lesson on tuning doesn’t end there; we still need to discuss one of the greatest enemies of fast code: latency. We’ll conclude by covering Storm’s metrics-collecting API and introducing a few custom metrics of our own. After all, knowing exactly what your topology is doing is an important part of understanding how to make it faster.
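To give you an early taste of that metrics-collecting API, here’s a minimal sketch of a bolt that registers one of Storm’s built-in metrics (CountMetric) and increments it for every tuple it handles. The class name and the metric name are ours, and the package names assume an Apache Storm 1.x-era release; older releases used backtype.storm.* instead of org.apache.storm.*.

import java.util.Map;
import org.apache.storm.metric.api.CountMetric;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

// Hypothetical bolt used only to illustrate registering a built-in metric.
public class InstrumentedBolt extends BaseRichBolt {
  private OutputCollector collector;
  private transient CountMetric tuplesProcessed;

  @Override
  public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
    this.collector = collector;
    // Register a CountMetric with the topology context; Storm flushes its
    // value to any configured metrics consumer every 60 seconds.
    this.tuplesProcessed = new CountMetric();
    context.registerMetric("tuples-processed", tuplesProcessed, 60);
  }

  @Override
  public void execute(Tuple tuple) {
    tuplesProcessed.incr();  // count every tuple this bolt processes
    collector.ack(tuple);
  }

  @Override
  public void declareOutputFields(OutputFieldsDeclarer declarer) {
    // This sketch emits no output streams.
  }
}

For the numbers to go anywhere, a metrics consumer has to be registered on the topology’s Config (for example, Storm ships a LoggingMetricsConsumer that writes metric values to the worker logs). We’ll walk through all of this in detail later in the chapter.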

6.1. Problem definition: Daily Deals! reborn

6.2. Initial implementation

6.3. Tuning: I wanna go fast

6.4. Latency: when external systems take their time

6.5. Storm’s metrics-collecting API

6.6. Summary
