A data engineer is investigating a Spark cluster that is experiencing underutilization during scheduled batch jobs.
After checking the Spark logs, they notice that tasks are often killed due to timeout errors, and that the logs contain several warnings about insufficient resources.
Which action should the engineer take to resolve the underutilization issue?
Underutilization combined with timeout errors and insufficient-resource warnings often indicates insufficient parallelism: there are not enough executors to run the pending tasks concurrently, so tasks wait for a slot until they time out.
Solution:
Increase the number of executors to allow more parallel task execution and better resource utilization.
Example configuration:
--conf spark.executor.instances=8
This distributes the workload more effectively across cluster nodes and reduces idle time for pending tasks.
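To make the effect of the executor count concrete, here is a rough back-of-the-envelope sketch in plain Python. It assumes Spark's default of one CPU core per task (`spark.task.cpus=1`), so the number of tasks that can run at once is roughly executors times cores per executor. The 200-task stage and 4 cores per executor are hypothetical numbers chosen for illustration, not values from the question.

```python
import math


def task_slots(executors: int, cores_per_executor: int) -> int:
    """Tasks that can run at the same time, assuming one core per task."""
    return executors * cores_per_executor


def waves(num_tasks: int, slots: int) -> int:
    """Scheduling rounds a stage needs before every task has had a slot."""
    return math.ceil(num_tasks / slots)


# Hypothetical workload for illustration only.
NUM_TASKS = 200
CORES_PER_EXECUTOR = 4

for executors in (2, 8):
    slots = task_slots(executors, CORES_PER_EXECUTOR)
    print(f"{executors} executors -> {slots} slots, "
          f"{waves(NUM_TASKS, slots)} waves")
```

Under these assumptions, going from 2 to 8 executors cuts the stage from 25 waves to 7, which is why pending tasks spend less time waiting. Real behavior also depends on what the cluster manager can actually grant.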
Why the other options are incorrect:
A: Extending timeouts hides the symptom, not the root cause (lack of executors).
B: More memory per executor won't fix scheduling bottlenecks.
C: Reducing partition size may increase overhead and does not fix resource imbalance.
Databricks Exam Guide (June 2025): Section "Troubleshooting and Tuning Apache Spark DataFrame API Applications", tuning executors and cluster utilization.
Spark Configuration: executor instances and resource scaling.