6 Real-time queries with Azure Stream Analytics

This chapter covers:

  • Creating a Stream Analytics service
  • Configuring inputs and outputs
  • Choosing the number of Streaming Units
  • Writing queries using window functions
  • Writing queries for parallel processing

In previous chapters you’ve seen examples of prep work for batch processing: loading files into storage and saving groups of messages into files. The Azure Storage account, Data Lake, and Event Hubs services laid the groundwork for building a batch processing analytics system in Azure. In this chapter, you’re going to see how these services support stream processing too.

Stream processing executes an operation either on each piece of data as it arrives in an endless sequence, or on groups of data items collected over a time-ordered sequence. These two approaches are called one-at-a-time (or real-time) stream processing and micro-batch processing, respectively.
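The two models can be sketched in plain Python (the function names, sample events, and four-second batch width here are illustrative only, not part of any Azure API):

```python
from datetime import datetime, timedelta

# A stream of timestamped events: one event per second.
events = [(datetime(2024, 1, 1, 0, 0, s), s) for s in range(10)]

def one_at_a_time(stream, predicate):
    """React to each event individually as it arrives."""
    for timestamp, value in stream:
        if predicate(value):
            yield timestamp, value

def micro_batches(stream, width):
    """Collect events into fixed-width time slices, then emit each slice whole."""
    batch, batch_end = [], None
    for timestamp, value in stream:
        if batch_end is None:
            batch_end = timestamp + width
        if timestamp >= batch_end:
            yield batch
            batch, batch_end = [], timestamp + width
        batch.append((timestamp, value))
    if batch:
        yield batch
```

The one-at-a-time path produces output per matching event with minimal latency; the micro-batch path trades latency for the ability to compute over groups of events at once.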

Figure 6.1. Data stream with one-at-a-time and micro-batch queries

Figure 6.1 shows two queries processing a stream of data. One query checks every new data item and returns an output for each match. The other counts the data items submitted during each repeating time frame. The data is organized by time. Data files from Azure Storage and messages from ingestion services like Event Hubs can both feed into stream processors. In this chapter, you’ll learn about a new Azure service: Stream Analytics.
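The second query in figure 6.1, counting items per repeating time frame, corresponds to what Stream Analytics calls a tumbling window. A minimal Python sketch of the idea (the function name, sample events, and five-second window are illustrative assumptions):

```python
from collections import Counter
from datetime import datetime, timedelta

def tumbling_window_counts(stream, window):
    """Count events per fixed, non-overlapping time window, keyed by window start."""
    counts = Counter()
    for timestamp, _value in stream:
        # Align the event timestamp to the start of its tumbling window.
        offset = (timestamp - datetime.min) % window
        counts[timestamp - offset] += 1
    return counts

# Ten events, one per second, fall into two five-second windows.
events = [(datetime(2024, 1, 1, 0, 0, s), s) for s in range(10)]
counts = tumbling_window_counts(events, timedelta(seconds=5))
```

Because the windows are fixed and non-overlapping, every event lands in exactly one window; this is the property that makes the counting query in figure 6.1 well defined.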

6.1  Creating a Stream Analytics service

6.1.1  Elements of a Stream Analytics job

6.1.2  Create an ASA job using the Azure Portal

6.1.3  Create an ASA job using Azure PowerShell

6.2  Configuring inputs and outputs

6.2.1  Event Hub job input

6.2.2  ASA job outputs

6.3  Creating a job query

6.3.1  Starting the ASA job

6.3.2  Failure to start

6.3.3  Output exceptions

6.4  Writing job queries

6.4.1  Window functions

6.4.2  Machine learning functions

6.5  Managing performance

6.5.1  Streaming units
