Question 44 of 55. A data engineer is working on a real-time analytics pipeline using Spark Structured Streaming. They want the system to process incoming data in micro-batches at a fixed interval of 5 seconds.
Which code snippet fulfills this requirement?
A.
query = df.writeStream \
.outputMode("append") \
.trigger(processingTime="5 seconds") \
.start()
B.
query = df.writeStream \
.outputMode("append") \
.trigger(continuous="5 seconds") \
.start()
C.
query = df.writeStream \
.outputMode("append") \
.trigger(once=True) \
.start()
D.
query = df.writeStream \
.outputMode("append") \
.start()
To process data in fixed micro-batch intervals, use the .trigger(processingTime='interval') option in Structured Streaming.
Correct usage:
query = df.writeStream \
    .outputMode("append") \
    .trigger(processingTime="5 seconds") \
    .start()
This instructs Spark to start a new micro-batch every 5 seconds, processing whatever data has arrived since the previous batch.
Why the other options are incorrect:
B: continuous="5 seconds" selects continuous processing mode, a different low-latency execution model; the interval there is a checkpoint interval, not a micro-batch interval.
C: once=True processes all available data in a single micro-batch and then stops, so it behaves like a one-off batch job rather than a recurring stream.
D: With no trigger specified, the default kicks off a new micro-batch as soon as the previous one finishes, not on a fixed schedule.
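Conceptually, a processing-time trigger fires micro-batches on a fixed wall-clock schedule: wait until the next trigger point, process what has arrived, then advance the schedule by the interval. A minimal Python sketch of that pacing loop (a toy stand-in for illustration only, not Spark internals; the name run_microbatches is invented here):

```python
import time

def run_microbatches(batches, interval_s):
    """Toy pacing loop mimicking trigger(processingTime=...).

    Starts one "micro-batch" every interval_s seconds; if a batch
    finishes early, we sleep until the next trigger point.
    """
    results = []
    next_fire = time.monotonic()
    for batch in batches:
        delay = next_fire - time.monotonic()
        if delay > 0:
            time.sleep(delay)        # wait for the next trigger point
        results.append(sum(batch))   # stand-in for processing the batch
        next_fire += interval_s      # fixed-interval schedule
    return results
```

Spark's scheduler does the equivalent bookkeeping for you: with trigger(processingTime="5 seconds"), each micro-batch is planned at 5-second boundaries regardless of how quickly the previous one completed.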
References: PySpark Structured Streaming Programming Guide, "Triggers" (processingTime, once, continuous); Databricks Exam Guide (June 2025), section "Structured Streaming" (controlling streaming triggers and batch intervals).