6 Real-time queries with Azure Stream Analytics

This chapter covers:

  • Creating a Stream Analytics service
  • Configuring inputs and outputs
  • Choosing the number of streaming units
  • Writing queries using window functions
  • Writing queries for parallel processing

In previous chapters you’ve seen examples of the prep work for batch processing: loading files into storage and saving groups of messages into files. Storage accounts, Data Lake, and Event Hubs form the foundation of a batch processing analytics system in Azure. In this chapter, you’ll see how these same services support stream processing too.

Stream processing means running an operation on each individual piece of data from an endless sequence, or on groups of data items from a time-ordered sequence. The first approach is called one-at-a-time (or real-time) stream processing; the second is called micro-batch processing.

Figure 6.1. Data stream with one-at-a-time and micro-batch queries

Figure 6.1 shows two queries processing a stream of data. One query checks each new data item and returns an output for every match. The other counts how many items arrived during a repeating time frame. In both cases the data is ordered by time. Files in Azure Storage and messages from ingestion services like Event Hubs can both feed a stream processor. Stream processors generate results continuously rather than on demand: the query is registered once, and results are output repeatedly.
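
Both styles of query from figure 6.1 can be written in the Stream Analytics query language, a SQL-like dialect you’ll use later in this chapter. Here’s a minimal sketch; the input alias (SensorInput), the output aliases (AlertOutput, CountOutput), and the temperature field are hypothetical names you would configure on the job:

    -- One-at-a-time: emit an output row for every event that matches the filter.
    SELECT deviceId, temperature
    INTO AlertOutput
    FROM SensorInput
    WHERE temperature > 100

    -- Micro-batch: count the events arriving in each repeating 5-minute window.
    SELECT COUNT(*) AS eventCount, System.Timestamp() AS windowEnd
    INTO CountOutput
    FROM SensorInput
    GROUP BY TumblingWindow(minute, 5)

A single job query can contain multiple SELECT...INTO statements like these. Once the job starts, both statements run continuously, pushing new rows to their outputs as events arrive.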

6.1  Creating a Stream Analytics service

6.1.1  Elements of a Stream Analytics job

6.1.2  Create an ASA job using the Azure portal

6.1.3  Create an ASA job using Azure PowerShell

6.2  Configuring inputs and outputs

6.2.1  Event Hub job input

6.2.2  ASA job outputs

6.3  Creating a job query

6.3.1  Starting the ASA job

6.3.2  Failure to start

6.3.3  Output exceptions

6.4  Writing job queries

6.4.1  Window functions

6.4.2  Machine learning functions

6.5  Managing performance
