Chapter 2. Getting data from clients: data ingestion

This chapter covers

  • Learning about the collection tier
  • Understanding the data collection patterns
  • Taking the collection tier to the next level
  • Protecting against data loss

We begin with the first tier: the collection tier, which is the entry point for bringing data into our streaming system. Figure 2.1 shows a slightly modified version of our blueprint, with the focus on the collection tier.

Figure 2.1. Architectural blueprint with emphasis on the collection tier

This tier is where data comes into the system and starts its journey; from here it progresses through the rest of the system. In the coming chapters we’ll follow the flow of data through each of the tiers. Your goal for this chapter is to learn about the collection tier. When you finish this chapter, you will know the common collection patterns, how to scale the tier, and how to improve its dependability by applying fault-tolerance techniques.

2.1. Common interaction patterns

Regardless of the protocol a client uses to send data to the collection tier (or, in certain cases, the protocol the collection tier uses to reach out and pull the data in), only a limited number of interaction patterns are in use today. Even considering the protocols driving the emergence of the Internet of Everything, each interaction falls into one of the following categories:

  • Request/response pattern
  • Publish/subscribe pattern
  • One-way pattern
  • Request/acknowledge pattern
  • Stream pattern
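
To make the differences between the five patterns concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions rather than an implementation from this book: the `CollectionTier` and `Broker` classes and all of their method names are hypothetical, events are plain dictionaries, and no real network protocol is involved. The point is only what, if anything, flows back to the sender in each pattern and which side initiates the exchange.

```python
import itertools
from collections import defaultdict
from typing import Callable, Iterable


class CollectionTier:
    """Toy collection tier (hypothetical); stores every event it receives."""

    def __init__(self) -> None:
        self.received = []
        self._ack_ids = itertools.count(1)

    def request_response(self, event: dict) -> dict:
        # Request/response: the client waits for a result built from its request.
        self.received.append(event)
        return {"status": "ok", "echo": event}

    def request_acknowledge(self, event: dict) -> int:
        # Request/acknowledge: the client gets only a receipt, not a result.
        self.received.append(event)
        return next(self._ack_ids)

    def one_way(self, event: dict) -> None:
        # One-way: fire and forget; nothing is returned to the client.
        self.received.append(event)

    def consume_stream(self, source: Iterable[dict]) -> int:
        # Stream: the tier acts as the client and pulls events from a source.
        count = 0
        for event in source:
            self.received.append(event)
            count += 1
        return count


class Broker:
    """Toy broker (hypothetical) for the publish/subscribe pattern."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        # The producer does not know the subscribers; the broker fans out.
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(event)
        return len(handlers)
```

In this sketch the tier takes part in publish/subscribe by registering one of its own methods as a handler, for example `broker.subscribe("clicks", tier.one_way)`, so that anything published to the assumed `"clicks"` topic is delivered to the tier without the producer addressing it directly.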

2.2. Scaling the interaction patterns

2.3. Fault tolerance

2.4. A dose of reality

2.5. Summary