Databricks Certified Associate Developer for Apache Spark 3.5 Exam - Topic 3 Question 14 Discussion

35 of 55. A data engineer is building a Structured Streaming pipeline and wants it to recover from failures or intentional shutdowns by continuing where it left off. How can this be achieved?
A) By configuring the option recoveryLocation during SparkSession initialization.
B) By configuring the option checkpointLocation during readStream.
C) By configuring the option checkpointLocation during writeStream.
D) By configuring the option recoveryLocation during writeStream.

Actual exam question from the Databricks Certified Associate Developer for Apache Spark 3.5 exam
Question #: 14
Topic #: 3

Suggested Answer: C

In Structured Streaming, checkpoints store state information (offsets, progress, and metadata) needed to resume a stream after a failure or restart.

Correct usage:

Set the checkpointLocation option when writing the streaming output:

    # Parentheses let the method chain span several lines in Python.
    query = (
        streaming_df.writeStream
        .format('delta')
        .option('checkpointLocation', '/path/to/checkpoint/dir')
        .start('/path/to/output')
    )

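When the query is restarted with the same checkpoint directory, Spark reads the recorded offsets and resumes from the last committed batch automatically. Combined with a replayable source and an idempotent sink such as Delta, this gives end-to-end exactly-once semantics.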
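The idea of resuming from recorded offsets can be illustrated with a toy model in plain Python. This is only a sketch of the concept, not how Spark implements checkpoints: the function run_batches, the file name offsets.json, and the list used as a source are all hypothetical, and the real checkpoint directory holds more than a single offset (for example, commit logs and state store data).

```python
import json
import os
import tempfile


def run_batches(source, checkpoint_path, max_batches, batch_size=2):
    """Process up to max_batches micro-batches from a list-like source.

    Toy model only: the next offset to read is saved to checkpoint_path
    after every batch, so a later call can resume where this one stopped.
    """
    # Resume from the last saved offset if a checkpoint file already exists.
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["next_offset"]

    processed = []
    for _ in range(max_batches):
        batch = source[offset:offset + batch_size]
        if not batch:
            break  # nothing left to read
        processed.extend(batch)
        offset += len(batch)
        # Record progress so a restart does not reprocess this batch.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_offset": offset}, f)
    return processed


source = list(range(10))
checkpoint_path = os.path.join(tempfile.mkdtemp(), "offsets.json")

# First run stops after two batches, standing in for a shutdown or failure.
first_run = run_batches(source, checkpoint_path, max_batches=2)
# Second run reuses the same checkpoint path and continues from offset 4.
second_run = run_batches(source, checkpoint_path, max_batches=10)

print(first_run)   # [0, 1, 2, 3]
print(second_run)  # [4, 5, 6, 7, 8, 9]
```

Pointing the second run at a different checkpoint path would start again from offset 0, which mirrors why a restarted streaming query needs the same checkpointLocation to continue where it left off.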

Why the other options are incorrect:

A/D: recoveryLocation is not a valid Spark configuration option.

B: Checkpointing tracks the progress of a streaming query, so checkpointLocation must be set on writeStream, where the query is defined and started, not on readStream.


References:

PySpark Structured Streaming Guide: Checkpointing and recovery.

Databricks Exam Guide (June 2025), section "Structured Streaming": explains checkpointing and fault-tolerant streaming recovery.

Contribute your Thoughts:

Rozella
1 month ago
I practiced a similar question, and I believe option B is correct because it relates to the readStream configuration.
Tomas
1 month ago
I'm not entirely sure, but I remember something about recoveryLocation being important. Maybe it's option A?
Marg
1 month ago
I think it might be option C, configuring checkpointLocation during writeStream, since checkpoints are crucial for recovery.