10 Service integration with Azure Data Factory

 

This chapter covers

  • Building a single-step processing pipeline
  • Using a secret key store
  • Scheduling batch data processing

In previous chapters, you’ve learned how to use Azure services to ingest and transform data. Except for Stream Analytics (SA), which processes incoming data automatically, you’ve added the data or triggered each process manually. In this chapter, you’ll learn how to move data between services on a schedule: you’ll move files between Azure Storage accounts and your Data Lake store (ADLS store), and you’ll run U-SQL scripts on a schedule to transform data. You’ll use Azure Data Lake Analytics (ADLA) to read and transform data from multiple sources, and you’ll store secrets in Azure Key Vault (AKV). Azure Data Factory (ADF) provides the connections that power this automation.

ADF manages execution of tasks. These can be as simple as calling a web service endpoint, or as complicated as creating a new server cluster to run custom code and removing it once the code completes. Each task is a resource entity consisting of a JSON resource definition. Each resource is related to one or more other resources. Resources and relationships are defined as follows:
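To make the idea of JSON resource definitions that refer to one another more concrete, here is a minimal Python sketch. It assumes a trimmed-down version of the general ADF layout (a `name` plus a `properties` object, with references expressed through `referenceName`). Every resource name is hypothetical, the definitions are incomplete (a real Copy activity also needs source and sink settings, for example), and the `referenced_names` helper is an illustration only, not part of any Azure SDK.

```python
import json

# Hypothetical, simplified resource definitions. Real ADF definitions carry
# more properties (connection details, typeProperties, and so on).
linked_service = {
    "name": "ExampleStorageLinkedService",
    "properties": {"type": "AzureBlobStorage"},
}

# A dataset points at the linked service that knows how to reach the data.
dataset = {
    "name": "ExampleSourceDataset",
    "properties": {
        "type": "Binary",
        "linkedServiceName": {
            "referenceName": linked_service["name"],
            "type": "LinkedServiceReference",
        },
    },
}

# A pipeline groups activities; each activity points at the datasets it uses.
pipeline = {
    "name": "ExamplePipeline",
    "properties": {
        "activities": [
            {
                "name": "ExampleCopyActivity",
                "type": "Copy",
                "inputs": [
                    {"referenceName": dataset["name"], "type": "DatasetReference"}
                ],
            }
        ]
    },
}


def referenced_names(resource):
    """Collect every referenceName found anywhere inside a resource definition."""
    found = []
    if isinstance(resource, dict):
        for key, value in resource.items():
            if key == "referenceName":
                found.append(value)
            else:
                found.extend(referenced_names(value))
    elif isinstance(resource, list):
        for item in resource:
            found.extend(referenced_names(item))
    return found


# Serialize the pipeline as JSON and list the resources it depends on.
print(json.dumps(pipeline, indent=2))
print(referenced_names(pipeline))
```

Walking the `referenceName` entries shows the chain of relationships in this sketch: the pipeline's activity depends on a dataset, and the dataset in turn depends on a linked service.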

10.1 Creating an Azure Data Factory service

10.2 Secure authentication

10.2.1 Azure Active Directory integration

10.2.2 Azure Key Vault

10.3 Copying files with ADF

10.3.1 Creating a Files storage container

10.3.2 Adding secrets to AKV

10.3.3 Creating a Files storage linkedservice

10.3.4 Creating an ADLS linkedservice

10.3.5 Creating a pipeline and activity

10.3.6 Creating a scheduled trigger

10.4 Running an ADLA job
