What happens to the underlying table data when a CLUSTER BY clause is added to a Snowflake table?
When a CLUSTER BY clause is added to a Snowflake table, it designates one or more columns (or expressions) as the table's clustering key. Snowflake then reorganizes the data in the background as needed, rewriting micro-partitions so that rows with similar key values are colocated in the same or adjacent micro-partitions. This makes query pruning more effective: when a query filters on the clustering key, Snowflake can skip micro-partitions whose value ranges cannot contain matching rows, which reduces the amount of data scanned and improves performance.
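The effect of colocating similar values on pruning can be illustrated with a toy model. The sketch below is not Snowflake's implementation: the helper names (`build_partitions`, `partitions_scanned`), the 100-row partition size, and the random data are all assumptions made for illustration. It only shows why per-partition min/max ranges exclude far more partitions once the data is ordered by the key.

```python
import random

def build_partitions(values, size):
    """Split values into fixed-size chunks and keep (min, max) metadata per chunk."""
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    return [(min(c), max(c)) for c in chunks]

def partitions_scanned(partitions, target):
    """Count chunks whose [min, max] range could contain the target value."""
    return sum(1 for lo, hi in partitions if lo <= target <= hi)

random.seed(0)
# Hypothetical key column: 10,000 rows with values between 1 and 1000.
values = [random.randint(1, 1000) for _ in range(10_000)]

# Unclustered: rows stay in arrival order, so every chunk spans a wide range.
unclustered = build_partitions(values, 100)
# Clustered (idealized): rows ordered by the key before being split into chunks.
clustered = build_partitions(sorted(values), 100)

target = 500  # stands in for a filter such as WHERE key = 500
scanned_unclustered = partitions_scanned(unclustered, target)
scanned_clustered = partitions_scanned(clustered, target)
print(scanned_unclustered, scanned_clustered)
```

In this model the unordered layout forces nearly every chunk to be read, while the ordered layout narrows the scan to the one or two chunks whose range covers the target. Real clustering is approximate rather than a full sort, so the actual benefit depends on the data and the query.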
References:
Snowflake Documentation on Clustering Keys & Clustered Tables
Community discussions on how source data's ordering affects a table with a cluster key