Data storage is the core piece of a data platform, around which everything else is built. This chapter focuses on storage solutions and their trade-offs. We’ll also introduce two Azure services that we will use and discuss how they integrate. Figure 2.1 recaps the high-level view from chapter 1, highlighting the component discussed in this chapter.
Figure 2.1 Storage is the core piece of a data platform around which everything else is built. Data gets ingested into the storage layer and is distributed from there. All workloads (data processing, analytics, and machine learning) access this layer.
Because data moves continuously in and out of the data platform, this chapter also covers the need to accommodate multiple storage solutions, both external and internal to the platform. We will sketch out the storage layer of our data platform, then stand up the corresponding Azure services.