Part 2 Ingesting, transforming, and storing data
Now that you have completed your first data engineering pipeline, you’re ready to explore the more advanced aspects of Snowflake data engineering.
In chapter 3, you’ll learn how to ingest data from a cloud storage provider and create external stages in Snowflake. We will explain and compare different approaches to ingesting files and offer tips on preparing data files in cloud storage for efficient ingestion.
In chapter 4, you’ll ingest semistructured data in JSON format and flatten it into a relational structure. You will add exception handling and logging to the data pipeline to make it resilient against unexpected errors.
In chapter 5, you’ll build a new data pipeline that continuously ingests data from files as soon as they land in external cloud storage. We will introduce Snowflake features such as Snowpipe for continuous data ingestion and dynamic tables for continuous data transformation.
Chapter 6 covers Snowpark, a set of libraries and code execution environments that let Python and other programming languages run natively in Snowflake.
In chapter 7, you’ll immerse yourself in generative AI and large language models (LLMs). You will learn how to call external API endpoints from Snowflake and use Snowflake’s own Cortex LLM functions to enhance your data pipelines.