This chapter covers:
- Creating a Stream Analytics service
- Configuring inputs and outputs
- Choosing the number of Streaming Units
- Writing queries using window functions
- Writing queries for parallel processing
In previous chapters you’ve seen examples of prep work for batch processing: loading files into storage and saving groups of messages into files. Azure Storage accounts, Data Lake, and Event Hubs form the foundation for building a batch processing analytics system in Azure. In this chapter, you’ll see how these same services support stream processing too.
Stream processing covers executing an operation on each piece of data as it arrives in an endless sequence. It also covers executing operations over groups of data items in a time-ordered sequence. These two approaches are called one-at-a-time (or real-time) stream processing and micro-batch processing.
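The difference between the two approaches can be sketched with a short Python example. This is a hypothetical in-memory stream, not the Stream Analytics query language; the event data and the 5-second window size are made up for illustration:

```python
from collections import Counter

# Hypothetical stream of events: (timestamp_in_seconds, value) pairs
stream = [(0, 3), (1, 12), (2, 7), (4, 15), (5, 2), (7, 9), (8, 11)]

# One-at-a-time: evaluate each item as it arrives,
# emitting an output for every match (here, value > 10)
matches = [event for event in stream if event[1] > 10]

# Micro-batch: count the items that fall into each
# repeating (tumbling) 5-second window
window_size = 5
counts = Counter(ts // window_size for ts, _ in stream)

print(matches)       # each matching event, one output per match
print(dict(counts))  # number of events per time window
```

The one-at-a-time query produces an output the moment a matching item appears; the micro-batch query waits for a window of time to close before emitting a single aggregate for that window.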
Figure 6.1. Data stream with one-at-a-time and micro-batch queries
Figure 6.1 shows two queries processing the same stream of data. One query checks every new data item and returns an output for each match. The other counts the data items submitted during each repeating time frame. In both cases the data is ordered by time. Data files from Azure Storage and messages from ingestion services like Event Hubs can both feed into stream processors. In this chapter, you’ll learn about a new Azure service: Stream Analytics.