Chapter 11. Analytics-on-write
This chapter covers
- Simple algorithms for analytics-on-write on event streams
- Modeling operational reporting as a DynamoDB table
- Writing an AWS Lambda function for analytics-on-write
- Deploying and testing an AWS Lambda function
In the previous chapter, we implemented a simple analytics-on-read strategy for OOPS, our fictitious package-delivery company, using Amazon Redshift. The focus was on storing our event stream in Redshift in such a way as to support as many analyses as possible “after the fact.” We modeled a fat event table, widened it further with dimension lookups for key entities including drivers and trucks, and then tried out a few analyses on the data in SQL.
For the purposes of this chapter, we will assume that some time has passed at OOPS, during which the BI team has grown comfortable writing SQL queries against the event stream stored in Redshift. Meanwhile, rumblings are coming from various stakeholders who want to see analyses for which Redshift is not well suited. For example, they are interested in the following:
- Low-latency operational reporting—This must be fed from the incoming event stream in as close to real time as possible (see the sketch after this list).
- Dashboards to support thousands of simultaneous users—For example, a parcel tracker on the website for OOPS customers.
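To make the contrast with analytics-on-read concrete, here is a minimal sketch of the pattern this chapter builds toward: an AWS Lambda function that consumes events from a stream and keeps pre-aggregated counts in a DynamoDB table. It is written in Python purely for illustration; the Kinesis trigger, the table name (oops-driver-metrics), its key schema, and the event fields are all assumptions for this sketch, not the design we will arrive at later in the chapter.

```python
import base64
import json

import boto3

# Hypothetical DynamoDB table holding pre-aggregated driver metrics;
# the table name and key schema are assumptions for this sketch.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("oops-driver-metrics")


def handler(event, context):
    """Kinesis-triggered Lambda: fold each incoming event into a running count.

    Analytics-on-write aggregates as events arrive, so later reads
    (dashboards, parcel trackers) become cheap key lookups.
    """
    for record in event["Records"]:
        # Kinesis delivers each event payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))

        # Assume each OOPS event carries a type and a driver identifier.
        event_type = payload.get("event")
        driver_id = payload.get("driver", {}).get("id")
        if event_type is None or driver_id is None:
            continue  # skip events this sketch does not understand

        # Atomically increment a per-driver, per-event-type counter.
        table.update_item(
            Key={"driver_id": driver_id, "metric": event_type},
            UpdateExpression="ADD event_count :one",
            ExpressionAttributeValues={":one": 1},
        )
```

Because every write keeps the aggregates current, a dashboard or parcel tracker only ever performs a key lookup against the table, which is what makes the low-latency and high-concurrency requirements above tractable.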