9 Integrating with Azure Data Lake Analytics

 

This chapter covers:

  • Using Azure Cognitive Services to enhance data
  • Building user-defined functions using Visual Studio and C#
  • Connecting to remote data sources

In the last chapter, you learned how to use features of Azure Data Lake Analytics (ADLA) to build reusable objects that improve your U-SQL scripts. You also used C# to enhance, and sometimes replace, the functions of SQL. In this chapter, you’ll build on that foundation of reuse and extension. You’ll use the Data Lake store to serve assembly files for use in ADLA jobs. You’ll run Azure PowerShell and U-SQL scripts to modify the ADLA and Data Lake environments. You’ll add new types of data extraction classes to ADLA and write C# functions that modify data in ADLA jobs. You’ll also connect to external providers to bring in even more data with minimal effort. This extensibility is possible because ADLA jobs are compiled.

9.1  Processing unstructured data

9.1.1  Azure Cognitive Services

9.1.2  Managing assemblies in the Data Lake

9.1.3  Image data extraction with Advanced Analytics

9.2  Reading different file types

9.2.1  Adding custom libraries with a catalog

9.2.2  Creating a catalog database

9.2.3  Building the U-SQL DataFormats solution

9.2.4  Code folders

9.2.5  Using custom assemblies

9.3  Connecting to remote sources

9.3.1  External databases

9.3.2  Credentials

9.3.3  Data source

9.3.4  Tables and views

9.4  Exercises

9.4.1  Exercise 1

9.4.2  Exercise 2

9.5  Summary