Databricks Certified Data Engineer Professional Exam - Topic 6 Question 46 Discussion

Actual exam question for the Databricks Certified Data Engineer Professional exam
Question #: 46
Topic #: 6

Which statement describes Delta Lake Auto Compaction?

Suggested Answer: E

Option E is the suggested answer because it matches the behavior of Delta Lake Auto Compaction, a feature that improves the layout of Delta Lake tables by coalescing small files into larger ones. After a write to a table has succeeded, Auto Compaction checks whether files within a partition can be further compacted. If so, it runs an optimize job with a default target file size of 128 MB, rather than the 1 GB default that a manually invoked OPTIMIZE targets. Auto Compaction only compacts files that have not been compacted previously. The suggested answer characterizes this as an asynchronous job; the Databricks documentation quoted below, however, states that it runs synchronously on the cluster that performed the write. Verified Reference: Databricks Certified Data Engineer Professional, "Delta Lake" section; Databricks Documentation, "Auto Compaction for Delta Lake on Databricks" section.

'Auto compaction occurs after a write to a table has succeeded and runs synchronously on the cluster that has performed the write. Auto compaction only compacts files that haven't been compacted previously.'

https://learn.microsoft.com/en-us/azure/databricks/delta/tune-file-size
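
To make "coalescing small files toward a 128 MB target" concrete, here is a toy sketch in plain Python. It is not Delta Lake's actual algorithm and does not call any Databricks API; the function name `plan_compaction` and the simple greedy grouping are assumptions made purely for illustration. It takes the file sizes in one partition, ignores files already at or above the target, and groups the remaining small files into batches that stay within the target size.

```python
from typing import List

MB = 1024 * 1024
TARGET_BYTES = 128 * MB  # default target size mentioned in the explanation above


def plan_compaction(file_sizes: List[int], target: int = TARGET_BYTES) -> List[List[int]]:
    """Hypothetical helper: group small files into batches of at most `target` bytes."""
    # Files already at or above the target are left alone.
    small = sorted(size for size in file_sizes if size < target)

    batches: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for size in small:
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)

    # Rewriting a single file on its own gains nothing, so drop one-file batches.
    return [batch for batch in batches if len(batch) > 1]


# One partition with three small files, one mid-sized file, and one large file.
plan = plan_compaction([10 * MB, 20 * MB, 100 * MB, 200 * MB, 30 * MB])
print([[size // MB for size in batch] for batch in plan])
```

In this model the 10, 20, and 30 MB files are merged into one batch, while the 100 MB file (which would push the batch past the target) and the 200 MB file are left untouched.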


Amie
4 days ago
I think B is more accurate, optimize should run before cluster termination.
upvoted 0 times
...
Jacquelyne
9 days ago
A is correct, it mentions the 1 GB default for optimization.
upvoted 0 times
...
Lai
1 month ago
I don't think option D is correct because it mentions a messaging bus, which doesn't seem to fit with how Delta Lake handles data commits.
upvoted 0 times
...
Jordan
1 month ago
I feel like option C is related to how optimized writes work, but I’m confused about the difference between logical and directory partitions.
upvoted 0 times
...
Kirk
1 month ago
I remember practicing a question about how optimize works, but I can't recall if it runs before the cluster terminates or after the write completes.
upvoted 0 times
...
Truman
2 months ago
I think Delta Lake Auto Compaction involves some kind of asynchronous job, but I'm not sure if it's 1 GB or 128 MB for the optimize job.
upvoted 0 times
...