In previous chapters you’ve seen examples of prep work for batch processing, loading files into storage, and saving groups of messages into files. Storage accounts, Data Lake, and Event Hubs form the foundation for building a batch processing analytics system in Azure. In this chapter, you’re going to see how these services support stream processing too.
Stream processing covers running an operation on individual pieces of data from an endless sequence, or on multiple pieces of data in a time-ordered sequence. These two approaches are called one-at-a-time (or real-time) stream processing and micro-batch processing.
Figure 6.1 shows two queries processing a stream of data. One query checks every new data item and returns an output for each match. The other query counts how many items were submitted during a repeating time frame. The data is organized by time. Data in files from Azure Storage and messages from ingestion services like Event Hubs can both feed into stream processors. Stream processors generate results in real time rather than on demand. The query is registered once, and results are output repeatedly.
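The two query styles from figure 6.1 can be sketched in a few lines of Python. This is an illustrative sketch only, not an Azure API: the `Event` type, field names, and an in-memory list standing in for an Event Hubs stream are all assumptions made for the example. The first function emits an output for every matching item (one-at-a-time processing); the second counts items per repeating time frame (micro-batch processing over a tumbling window).

```python
from dataclasses import dataclass

# Hypothetical event type for illustration; the field names are assumptions.
@dataclass
class Event:
    timestamp: float  # seconds since the start of the stream
    value: int

def filter_stream(events, predicate):
    """One-at-a-time: emit an output for each event that matches."""
    for e in events:
        if predicate(e):
            yield e

def tumbling_count(events, window_seconds):
    """Micro-batch: count how many events fall in each fixed, repeating window."""
    counts = {}
    for e in events:
        window = int(e.timestamp // window_seconds)
        counts[window] = counts.get(window, 0) + 1
    return counts

# A small in-memory "stream" standing in for messages from an ingestion service.
stream = [Event(t, v) for t, v in [(0.5, 9), (1.2, 3), (2.8, 7), (3.1, 2), (4.9, 8)]]

matches = list(filter_stream(stream, lambda e: e.value > 5))  # events with values 9, 7, 8
per_window = tumbling_count(stream, window_seconds=2)         # counts per 2-second window
```

In a real stream processor the query is registered once and runs continuously as new events arrive; here the finite list simply makes the two output shapes visible: one result per matching item versus one count per time window.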