
Databricks Certified Associate Developer for Apache Spark 3.5 Exam - Topic 3 Question 14 Discussion


A data engineer is building a Structured Streaming pipeline and wants it to recover from failures or intentional shutdowns by continuing where it left off.

How can this be achieved?

Suggested Answer: C (set the checkpointLocation option on the streaming writer)

In Structured Streaming, checkpoints store state information (offsets, progress, and metadata) needed to resume a stream after a failure or restart.

Correct usage:

Set the checkpointLocation option when writing the streaming output:

streaming_df.writeStream \
    .format('delta') \
    .option('checkpointLocation', '/path/to/checkpoint/dir') \
    .start('/path/to/output')

Spark uses this checkpoint directory to recover progress automatically and maintain exactly-once semantics.

Why the other options are incorrect:

A/D: recoveryLocation is not a valid Spark configuration option.

B: Checkpointing must be configured in writeStream, not during readStream.


References:

PySpark Structured Streaming Programming Guide --- checkpointing and recovery.

Databricks Exam Guide (June 2025): Section ''Structured Streaming'' --- explains checkpointing and fault-tolerant streaming recovery.
