What happens to the underlying table data when a CLUSTER BY clause is added to a Snowflake table?
When a CLUSTER BY clause is added to a Snowflake table, it defines one or more columns or expressions as the clustering key that governs how rows are organized within the table's micro-partitions. Snowflake reorganizes the data so that rows with similar key values are colocated in the same or adjacent micro-partitions; for an existing table this happens in the background through Automatic Clustering rather than as an immediate rewrite. The result is more effective pruning: for queries that filter on the clustering key, Snowflake can skip micro-partitions whose value ranges cannot match the filter, which reduces the data scanned and improves performance.
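The effect of colocation on pruning can be illustrated with a toy model. The sketch below is not Snowflake code: the helper names (build_partitions, partitions_scanned) are hypothetical, and a partition is assumed to hold a fixed number of rows, which is not how Snowflake sizes micro-partitions. It only shows why keeping per-partition min/max metadata lets a range filter skip partitions once similar values sit together.

```python
import random


def build_partitions(values, rows_per_partition):
    """Split values into fixed-size chunks and record min/max metadata for each."""
    partitions = []
    for start in range(0, len(values), rows_per_partition):
        chunk = values[start:start + rows_per_partition]
        partitions.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return partitions


def partitions_scanned(partitions, lo, hi):
    """Count partitions whose [min, max] range overlaps the filter range [lo, hi]."""
    return sum(1 for p in partitions if p["max"] >= lo and p["min"] <= hi)


# Assumed toy data: 10,000 distinct key values arriving in random order.
random.seed(0)
values = list(range(10_000))
random.shuffle(values)

# Same rows, two layouts: arrival order versus ordered by the key.
unclustered = build_partitions(values, rows_per_partition=100)
clustered = build_partitions(sorted(values), rows_per_partition=100)

# A filter comparable to: WHERE key BETWEEN 2500 AND 2599
lo, hi = 2_500, 2_599
scanned_unclustered = partitions_scanned(unclustered, lo, hi)
scanned_clustered = partitions_scanned(clustered, lo, hi)

print(f"unclustered: {scanned_unclustered} of {len(unclustered)} partitions scanned")
print(f"clustered:   {scanned_clustered} of {len(clustered)} partitions scanned")
```

In the clustered layout the filter range falls inside a single partition, so only that one is read. In the shuffled layout almost every partition spans a wide range of values, so very few can be skipped.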
References:
Snowflake Documentation on Clustering Keys & Clustered Tables
Community discussions on how source data's ordering affects a table with a cluster key