This chapter covers:
- Building a single-step processing pipeline
- Using a secret key store
- Scheduling batch data processing
In previous chapters, you’ve learned how to use Azure services to ingest and transform data. Except for Stream Analytics (SA), which automatically processes incoming data, you have added data or triggered processing manually. In this chapter, you’ll learn how to move data between services on a schedule. You’ll learn how to move files between Azure Storage accounts and your Azure Data Lake (ADL). You’ll also learn how to run U-SQL scripts on a schedule to transform data. You’ll use Azure Data Lake Analytics (ADLA) to read and transform data from multiple sources. You’ll learn how to store secrets in Azure Key Vault (AKV). Azure Data Factory (ADF) provides the connections that power this automation.
ADF manages the execution of tasks. These tasks can be as simple as calling a web service endpoint, or as complicated as creating a new server cluster to run custom code and removing it once the code completes. Each task is described by a JSON definition. Tasks and their relationships are defined as follows:
- Each task is called an activity.
- Activities connect to external services using a linked service.
- One or more activities connect to form a pipeline.
- One or more pipelines form a data factory.
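The nesting described above can be sketched with plain Python dictionaries mirroring the shape of ADF's JSON definitions. This is an illustrative sketch, not output from the Azure portal or SDK; the names (`BlobStore`, `CopyToDataLake`, `NightlyImport`, and the dataset references) are hypothetical placeholders.

```python
# Linked service: connection details for an external service
# (here, a hypothetical Blob Storage account named "BlobStore").
linked_service = {
    "name": "BlobStore",
    "properties": {
        "type": "AzureBlobStorage",
        # In practice the secret would come from Azure Key Vault.
        "typeProperties": {"connectionString": "<connection-secret>"},
    },
}

# Activity: a single task. A Copy activity reads from an input
# dataset and writes to an output dataset; each dataset in turn
# uses a linked service to reach its external store.
copy_activity = {
    "name": "CopyToDataLake",
    "type": "Copy",
    "inputs": [{"referenceName": "BlobInputDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "LakeOutputDataset", "type": "DatasetReference"}],
}

# Pipeline: one or more activities grouped for execution.
pipeline = {
    "name": "NightlyImport",
    "properties": {"activities": [copy_activity]},
}

# Data factory: one or more pipelines.
data_factory = {"name": "MyFactory", "pipelines": [pipeline]}

print(data_factory["pipelines"][0]["properties"]["activities"][0]["name"])
```

The key idea is the reference-by-name pattern: activities do not embed connection details directly, they point at datasets and linked services by name, so the same linked service can back many activities.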