Welcome to Pass4Success

Databricks Exam Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 Topic 2 Question 78 Discussion

Actual exam question for Databricks's Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 exam
Question #: 78
Topic #: 2

Which of the following describes a shuffle?

Suggested Answer: C (A shuffle is a process that compares data across partitions.)

A shuffle is a Spark operation that results from DataFrame.coalesce().

No. DataFrame.coalesce() does not result in a shuffle.
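To make the point above concrete, here is a toy model in plain Python, not Spark code. It assumes partitions can be pictured as lists of records, and the helper name coalesce_partitions is made up for illustration. It only sketches the idea that coalescing merges whole existing partitions rather than re-routing individual records.

```python
def coalesce_partitions(partitions, n):
    """Toy model: merge whole partitions into at most n groups.

    No record is routed individually by key, which is the intuition
    behind coalesce not needing a shuffle.
    """
    if n >= len(partitions):
        # Asking for more partitions than exist leaves the layout unchanged.
        return [list(p) for p in partitions]
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Neighbouring input partitions land in the same output partition.
        merged[i * n // len(partitions)].extend(part)
    return merged


parts = [[1, 2], [3], [4, 5], [6]]
result = coalesce_partitions(parts, 2)
print(result)
```

Each input partition ends up wholly inside one output partition, so nothing has to be split up and redistributed.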

A shuffle is a process that allocates partitions to executors.

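This is incorrect. Assigning work on partitions to executors is the job of Spark's scheduler; a shuffle redistributes data between partitions.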

A shuffle is a process that is executed during a broadcast hash join.

No, broadcast hash joins avoid shuffles and yield performance benefits if at least one of the two tables is small in size (<= 10 MB by default, controlled by spark.sql.autoBroadcastJoinThreshold). Broadcast hash joins can avoid shuffles because, instead of exchanging partitions between executors, they broadcast the small table to all executors, which then perform the rest of the join operation locally.
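The broadcast idea described above can be sketched in plain Python. This is a toy model under stated assumptions, not Spark's implementation: partitions are lists of (key, value) tuples, the small table fits in a dict, and the function name broadcast_hash_join and the sample data are invented for illustration.

```python
def broadcast_hash_join(large_partitions, small_table):
    """Toy inner join: every partition of the large table is joined locally
    against a full copy of the small table."""
    # Build the hash map once; in this model it stands in for the copy
    # that would be broadcast to every executor.
    lookup = dict(small_table)
    joined = []
    for part in large_partitions:
        # Records of the large table never leave their partition.
        joined.append([(k, v, lookup[k]) for k, v in part if k in lookup])
    return joined


orders = [[(1, "pen"), (2, "ink")], [(1, "pad"), (3, "cap")]]
countries = [(1, "DE"), (2, "FR")]
result = broadcast_hash_join(orders, countries)
print(result)
```

The large table keeps its original partition layout, and only the small table is copied around. That is why no exchange of partitions is needed in this model.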

A shuffle is a process that compares data across executors.

No, in a shuffle, data is compared across partitions, and not executors.
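For contrast, a shuffle can be pictured as every record being re-routed to an output partition chosen from its key. The sketch below is a plain-Python toy model, not Spark code; the helper name shuffle_by_key, the integer keys, and the modulo routing rule are assumptions made for illustration.

```python
def shuffle_by_key(partitions, num_output):
    """Toy shuffle: route each (key, value) record to the output partition
    given by key % num_output, regardless of where it started."""
    out = [[] for _ in range(num_output)]
    for part in partitions:
        for key, value in part:
            out[key % num_output].append((key, value))
    return out


parts = [[(1, "a"), (2, "b")], [(1, "c"), (3, "d")], [(2, "e")]]
shuffled = shuffle_by_key(parts, 2)
print(shuffled)
```

Records with the same key started in different input partitions and end up together in one output partition. The model says nothing about executors: the input partitions could all sit on a single executor and the routing would be the same.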

More info: Spark Repartition & Coalesce - Explained (https://bit.ly/32KF7zS)


Contribute your Thoughts:

Lili
1 month ago
A) A shuffle is a process that is executed during a broadcast hash join. Haha, that's a good one! I don't think that's the right answer, though.
upvoted 0 times
...
Carmela
1 month ago
B) A shuffle is a process that compares data across executors. Hmm, I'm not sure about this one. Sounds a bit too specific.
upvoted 0 times
Levi
22 days ago
E) A shuffle is a process that allocates partitions to executors.
upvoted 0 times
...
Micaela
1 month ago
C) A shuffle is a process that compares data across partitions.
upvoted 0 times
...
Reynalda
1 month ago
A) A shuffle is a process that is executed during a broadcast hash join.
upvoted 0 times
...
...
Annelle
2 months ago
E) A shuffle is a process that allocates partitions to executors. That makes sense, but I'm a bit unsure about it.
upvoted 0 times
Elouise
22 days ago
E) A shuffle is a process that allocates partitions to executors.
upvoted 0 times
...
Anjelica
24 days ago
C) A shuffle is a process that compares data across partitions.
upvoted 0 times
...
Amber
30 days ago
A) A shuffle is a process that is executed during a broadcast hash join.
upvoted 0 times
...
...
Lucille
2 months ago
D) A shuffle is a Spark operation that results from DataFrame.coalesce(). I think this is the right answer, but I'm not 100% sure.
upvoted 0 times
...
Lamonica
2 months ago
I agree with Dana, a shuffle is definitely about allocating partitions to executors.
upvoted 0 times
...
Gilma
2 months ago
C) A shuffle is a process that compares data across partitions. This sounds like the correct answer to me.
upvoted 0 times
Amie
1 month ago
I agree, a shuffle compares data across partitions.
upvoted 0 times
...
Lizbeth
2 months ago
I think C is the correct answer.
upvoted 0 times
...
...
Dana
2 months ago
I believe a shuffle is a process that allocates partitions to executors.
upvoted 0 times
...
Tashia
2 months ago
I think a shuffle is when data is compared across partitions.
upvoted 0 times
...