What happens to the underlying table data when a CLUSTER BY clause is added to a Snowflake table?
When a CLUSTER BY clause is added to a Snowflake table, it defines one or more columns (or expressions) as the clustering key used to organize the data within the table's micro-partitions. Clustering aims to colocate rows with similar key values in the same or adjacent micro-partitions. This makes query pruning more effective: Snowflake can skip micro-partitions whose metadata shows they cannot contain rows matching the query's filters, which reduces the amount of data scanned and improves performance.
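The effect of colocation on pruning can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not Snowflake's actual implementation: partitions are modeled as fixed-size chunks of 100 rows, each chunk keeps only a min/max value as metadata, and the helper names build_partitions and partitions_scanned are hypothetical.

```python
import random

def build_partitions(values, rows_per_partition):
    """Split values into fixed-size chunks and keep (min, max) metadata per chunk."""
    parts = []
    for i in range(0, len(values), rows_per_partition):
        chunk = values[i:i + rows_per_partition]
        parts.append((min(chunk), max(chunk)))
    return parts

def partitions_scanned(parts, lo, hi):
    """Count chunks whose [min, max] range overlaps the filter range [lo, hi]."""
    return sum(1 for pmin, pmax in parts if pmax >= lo and pmin <= hi)

random.seed(0)
values = list(range(10_000))
random.shuffle(values)  # rows arrive in no particular order

# Same rows, two layouts: arrival order vs. ordered by the (hypothetical) cluster key.
unclustered = build_partitions(values, 100)
clustered = build_partitions(sorted(values), 100)

# Filter equivalent to: WHERE key BETWEEN 2500 AND 2599
scanned_unclustered = partitions_scanned(unclustered, 2500, 2599)
scanned_clustered = partitions_scanned(clustered, 2500, 2599)

print("unclustered partitions scanned:", scanned_unclustered)
print("clustered partitions scanned:", scanned_clustered)
```

In the clustered layout the filter range falls inside a single chunk, so every other chunk can be skipped from its metadata alone. In the shuffled layout most chunks span nearly the whole value range, so few of them can be ruled out.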
References:
Snowflake Documentation on Clustering Keys & Clustered Tables
Community discussions on how source data's ordering affects a table with a cluster key