Chapter 3. Topology design

This chapter covers

Decomposing a problem to fit Storm constructs
Working with unreliable data sources
Integrating with external services and data stores
Understanding parallelism within a Storm topology
Following best practices for topology design

In the previous chapter, we got our feet wet by building a simple topology that counts commits made to a GitHub project. We broke it down into Storm’s two primary components—spouts and bolts—but we didn’t concern ourselves with details as to why. This chapter expands on those basic Storm concepts by showing you how to think about modeling and designing solutions with Storm. You’ll learn strategies for problem analysis that can help you end up with a good design: a model for representing the workflow of the problem at hand.

It’s also important to learn how scalability (the parallelization of units of work) is built into Storm, because that affects the approach you’ll take to topology design. We’ll also explore strategies for getting the most out of your topology in terms of speed.

After reading this chapter, not only will you be able to easily take apart a problem and see how it fits within Storm, but you’ll also be able to determine whether Storm is the right solution for tackling that problem. This chapter will give you a solid understanding of topology design so that you can envision solutions to big data problems.

3.1. Approaching topology design

3.2. Problem definition: a social heat map

3.3. Precepts for mapping the solution to Storm

3.4. Initial implementation of the design

3.5. Scaling the topology

3.6. Topology design paradigms

3.7. Summary
