chapter nine

9 Data access and security

This chapter covers

Appreciating how data from the data platform is consumed
Comparing cloud-native data warehouse offerings
Using cloud-native services to support application data access patterns
Simplifying the machine learning lifecycle
Understanding the basics of a cloud security model

In this chapter we acknowledge that the primary reason for developing a data platform is to make data available to data consumers cost-effectively, securely, and at scale. Throughout this book we have assumed that your data platform will include a data warehouse to support users who access data via business intelligence (BI) tools or by running SQL queries directly, but this isn’t the only way data will be accessed.

Increasingly, users, especially data scientists, also access raw data in storage directly, and applications want access to that data as well. The layered design we’ve discussed throughout this book makes it easy to support a variety of data consumers.

9.1 Different types of data consumers

9.2 Cloud data warehouses

9.2.1 AWS Redshift

9.2.2 Azure Synapse

9.2.3 Google BigQuery

9.2.4 Choosing the right data warehouse

9.3 Application data access

9.3.1 Cloud relational databases

9.3.2 Cloud key/value data stores

9.3.3 Full-text search services

9.3.4 In-memory cache

9.4 Machine learning on the data platform

9.4.1 Machine learning model lifecycle on a cloud data platform

9.4.2 ML cloud collaboration tools

9.5 Business intelligence and reporting tools