8 Optimizing query performance

 

This chapter covers

  • Getting data from the Snowflake Marketplace
  • Performing analysis of geographical data
  • Viewing query performance using the query profile
  • Understanding Snowflake micro-partitions
  • Optimizing storage with clustering
  • Improving query performance with search optimization
  • General tips for improving query performance

Data engineers must ensure that their data pipelines perform well, especially when dealing with large amounts of data. They should write efficient SQL queries and be familiar with Snowflake optimization techniques to meet user performance requirements.

In this chapter, we will write queries using large amounts of data from the Snowflake Marketplace. We will use Snowflake’s query profile tool to understand the mechanics of query execution. We will learn about Snowflake’s units of storage, known as micro-partitions. We will apply clustering to underlying tables to improve query performance and identify queries that benefit from this type of optimization. We will also look at search optimization and describe the types of queries that benefit. Finally, we will learn how to identify queries that are candidates for optimization and share tips on improving query performance.

8.1 Getting data from the Snowflake Marketplace

8.2 Performing analysis of geographical data

8.2.1 Snowflake’s geography functions

8.2.2 Copying data from the shared database

8.2.3 Viewing query execution parameters using the query profile

8.3 Understanding Snowflake micro-partitions

8.3.1 A conceptual example of micro-partitions

8.3.2 Micro-partition pruning

8.4 Optimizing storage with clustering

8.4.1 Viewing clustering information

8.4.2 Adding clustering keys to a table

8.4.3 Monitoring the clustering process

8.4.4 Viewing improved query execution after clustering

8.5 Improving query performance with search optimization

8.5.1 Adding search optimization to a table

8.5.2 Reviewing query performance after adding search optimization

8.6 General tips for improving query performance