Databricks Certified Associate Developer for Apache Spark 3.5 Exam - Topic 1 Question 15 Discussion

34 of 55.

A data engineer is investigating a Spark cluster that is experiencing underutilization during scheduled batch jobs.

After checking the Spark logs, they noticed that tasks are often getting killed due to timeout errors, and there are several warnings about insufficient resources in the logs.

Which action should the engineer take to resolve the underutilization issue?

A) Set the spark.network.timeout property to allow tasks more time to complete without being killed.
B) Increase the executor memory allocation in the Spark configuration.
C) Reduce the size of the data partitions to improve task scheduling.
D) Increase the number of executor instances to handle more concurrent tasks.

Actual exam question for Databricks's Databricks Certified Associate Developer for Apache Spark 3.5 exam
Question #: 15
Topic #: 1

Suggested Answer: D

Underutilization accompanied by task timeouts and insufficient-resource warnings often indicates insufficient parallelism: there aren't enough executors to run the pending tasks concurrently.

Solution:

Increase the number of executors to allow more parallel task execution and better resource utilization.

Example configuration (passed to spark-submit; the value 8 is illustrative):

--conf spark.executor.instances=8

This distributes the workload more effectively across cluster nodes and reduces idle time for pending tasks.
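
As a rough illustration of why executor count affects how long tasks wait, here is a minimal Python sketch of the slot arithmetic. It assumes each executor offers spark.executor.cores / spark.task.cpus task slots; the 4 cores per executor and the 200-task stage are hypothetical values chosen for the example, not figures from the question.

```python
import math


def task_slots(executor_instances: int, executor_cores: int, task_cpus: int = 1) -> int:
    """Tasks that can run at the same time across all executors."""
    return executor_instances * (executor_cores // task_cpus)


def scheduling_waves(num_tasks: int, slots: int) -> int:
    """Rounds of scheduling needed before every task has had a slot."""
    return math.ceil(num_tasks / slots)


# Hypothetical stage with 200 tasks and 4 cores per executor.
few_executors = task_slots(executor_instances=2, executor_cores=4)   # 8 slots
more_executors = task_slots(executor_instances=8, executor_cores=4)  # 32 slots

print(scheduling_waves(200, few_executors))   # 25
print(scheduling_waves(200, more_executors))  # 7
```

With only 2 executors in this sketch, most tasks sit in the queue for many waves; raising the count to 8 cuts the number of waves from 25 to 7.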

Why the other options are incorrect:

A: Extending timeouts masks the symptom without addressing the root cause (too few executors).

B: More memory per executor won't fix scheduling bottlenecks, because it does not add task slots.

C: Reducing partition size creates more tasks, which may increase scheduling overhead, and does not address the shortage of resources.
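
The comparison between options B and D can be sketched with a toy model in Python. This is only an illustration under stated assumptions: `ClusterSketch` is a hypothetical helper, not a Spark API, the core and memory values are made up, and spark.task.cpus is assumed to be 1.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ClusterSketch:
    """Hypothetical stand-in for a few executor settings (not a Spark class)."""
    executor_instances: int   # spark.executor.instances
    executor_cores: int       # spark.executor.cores (assumed value)
    executor_memory_gb: int   # spark.executor.memory (assumed value)

    def concurrent_tasks(self) -> int:
        # With spark.task.cpus=1, memory does not enter the slot count.
        return self.executor_instances * self.executor_cores


baseline = ClusterSketch(executor_instances=2, executor_cores=4, executor_memory_gb=8)
more_memory = replace(baseline, executor_memory_gb=16)     # option B
more_executors = replace(baseline, executor_instances=8)   # option D

print(baseline.concurrent_tasks())        # 8
print(more_memory.concurrent_tasks())     # 8
print(more_executors.concurrent_tasks())  # 32
```

In this model, doubling memory leaves concurrency unchanged, while adding executors raises it.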


Databricks Exam Guide (June 2025): Section "Troubleshooting and Tuning Apache Spark DataFrame API Applications", tuning executors and cluster utilization.

Spark Configuration: executor instances and resource scaling.

Kati
1 month ago
I remember reading about timeout issues in Spark, and I think adjusting the spark.network.timeout could help with that.
upvoted 0 times