Chapter 11. Analytics-on-write


Chapter 11 from Event Streams in Action by Alexander Dean and Valentin Crettaz

This chapter covers

  • Simple algorithms for analytics-on-write on event streams
  • Modeling operational reporting as a DynamoDB table
  • Writing an AWS Lambda function for analytics-on-write
  • Deploying and testing an AWS Lambda function

In the previous chapter, we implemented a simple analytics-on-read strategy for OOPS, our fictitious package-delivery company, using Amazon Redshift. The focus was on storing our event stream in Redshift in such a way as to support as many analyses as possible “after the fact.” We modeled a fat event table, widened it further with dimension lookups for key entities including drivers and trucks, and then tried out a few analyses on the data in SQL.

For the purposes of this chapter, we will assume that some time has passed at OOPS, during which the BI team has grown comfortable with writing SQL queries against the OOPS event stream as stored in Redshift. Meanwhile, rumblings are coming from various stakeholders at OOPS who want to see analyses that are not well suited to Redshift. For example, they are interested in the following:

  • Low-latency operational reporting: this must be fed from the incoming event streams in as close to real time as possible.
  • Dashboards to support thousands of simultaneous users: for example, a parcel tracker on the website for OOPS customers.
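Both requirements point toward analytics-on-write: aggregating each incoming event as it arrives, and storing the running results in a low-latency key-value store such as DynamoDB. As a rough sketch of the shape this takes (not the book's actual code), the handler below folds a batch of Kinesis records into per-driver event counts and then applies each count to a hypothetical DynamoDB counter table with an atomic ADD; the event shape (a JSON payload with a `driver` field) and the table wiring are illustrative assumptions.

```python
import base64
import json
from collections import Counter


def aggregate_counts(kinesis_event):
    """Fold a batch of Kinesis records into per-driver event counts.

    Each record's data is a base64-encoded JSON event; the event
    shape (a 'driver' id field) is an assumption for illustration.
    """
    counts = Counter()
    for record in kinesis_event["Records"]:
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        counts[payload["driver"]] += 1
    return counts


def handler(event, context, table=None):
    """AWS Lambda entry point.

    Aggregates the batch in memory, then applies each count as an
    atomic ADD to a (hypothetical) DynamoDB table of per-driver
    counters, so dashboard readers see near-real-time totals.
    `table` would be a boto3 DynamoDB Table resource in production;
    it is injectable here so the aggregation logic can run offline.
    """
    counts = aggregate_counts(event)
    if table is not None:
        for driver, n in counts.items():
            table.update_item(
                Key={"driver": driver},
                UpdateExpression="ADD event_count :n",
                ExpressionAttributeValues={":n": n},
            )
    return dict(counts)
```

Because the DynamoDB update uses an `ADD` expression, concurrent Lambda invocations increment the same counter safely without a read-modify-write race; that idempotence-adjacent design choice is a recurring theme in analytics-on-write systems.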

11.1. Back to OOPS

11.2. Building our Lambda function

11.3. Running our Lambda function

Summary