Chapter 9. Analyzing Meetup RSVPs in real time
This chapter covers
- Building a complete streaming data pipeline
- Planning to take it to production
Congratulations! You have reached the chapter where we put all the material you have read to use. In this chapter we will build a complete streaming data pipeline and an application that consumes the stream. Instead of using a fictitious data set (and leaving you wondering how this works in the wild), we’ll use a live one as the data source for our pipeline: the Meetup Streaming RSVP API (www.meetup.com/meetup_api/docs/stream/2/rsvps/#websockets). The web application we build at the end of the chapter will let us glean insight from the RSVP stream. To aid in debugging, and in case the data source is no longer available, the code for this chapter includes a sample data file that you can use to simulate the stream. After you complete this chapter you will have a fully functional streaming pipeline and web application, and you’ll be in a good position to take them to the next level with this data set or a totally different one.
Before we embark on our journey of building, I want to point out two things. First, it would be impossible to cover every combination of technical choices we could make along the way. Second, it is impractical to implement in one chapter everything we have discussed in the previous eight chapters.