This chapter covers
- Learning about the streaming data pipeline model and its distributed framework
- Determining where streaming data applications and the data stream model meet
- Identifying where algorithms and data structures fit in data streams
- Understanding the basic computational constraints and concepts inherent to data streams
- Providing some probabilistic background for the two chapters that follow
Previous chapters introduced a number of algorithms and data structures for sketching (i.e., compactly summarizing an important characteristic of) huge amounts of data, whether that data resides in a database or, as you saw in the application of HyperLogLog to network traffic monitoring, arrives and expires at a lightning rate. In this chapter, we round out the picture by placing these algorithms in the broader context of streaming data systems.
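To make that recap concrete, here is a minimal HyperLogLog-style sketch in Python that estimates the number of distinct items seen in a stream. It is an illustrative toy written under our own assumptions (the class name TinyHLL, the precision parameter p, and the synthetic stream of flow identifiers are ours), not the implementation discussed in the earlier chapters.

```python
# A minimal HyperLogLog-style sketch (illustrative toy, not the book's implementation).
import hashlib

class TinyHLL:
    def __init__(self, p=10):
        self.p = p                      # p bits of the hash select one of m registers
        self.m = 1 << p
        self.registers = [0] * self.m

    def _hash64(self, item):
        digest = hashlib.sha1(str(item).encode()).digest()
        return int.from_bytes(digest[:8], "big")      # use the first 64 bits

    def add(self, item):
        x = self._hash64(item)
        idx = x >> (64 - self.p)                      # top p bits: register index
        rest = x & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)         # standard bias-correction constant
        harmonic_sum = sum(2.0 ** -r for r in self.registers)
        return alpha * self.m * self.m / harmonic_sum

# Feed the sketch a stream of 100,000 distinct flow identifiers and estimate their count.
sketch = TinyHLL()
for flow_id in (f"flow-{i}" for i in range(100_000)):
    sketch.add(flow_id)
print(round(sketch.estimate()))   # typically within a few percent of 100,000
```

Note that the full HyperLogLog algorithm also applies small-range and large-range corrections to the raw estimate, which this toy omits for brevity.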