Question 44 of 55. A data engineer is working on a real-time analytics pipeline using Spark Structured Streaming. They want the system to process incoming data in micro-batches at a fixed interval of 5 seconds.
Which code snippet fulfills this requirement?
A.
query = df.writeStream \
.outputMode("append") \
.trigger(processingTime="5 seconds") \
.start()
B.
query = df.writeStream \
.outputMode("append") \
.trigger(continuous="5 seconds") \
.start()
C.
query = df.writeStream \
.outputMode("append") \
.trigger(once=True) \
.start()
D.
query = df.writeStream \
.outputMode("append") \
.start()
Correct answer: A. To process data in fixed micro-batch intervals, use the .trigger(processingTime='interval') option in Structured Streaming.
Correct usage:
query = df.writeStream \
    .outputMode('append') \
    .trigger(processingTime='5 seconds') \
    .start()
This instructs Spark to process available data every 5 seconds.
Why the other options are incorrect:
B: trigger(continuous="5 seconds") enables continuous processing mode, a different low-latency execution model; the interval there is a checkpoint interval, not a micro-batch interval.
C: trigger(once=True) processes all available data in a single micro-batch and then stops (one-shot batch-style run).
D: With no trigger specified, Spark starts the next micro-batch as soon as the previous one finishes, rather than at a fixed interval.
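For context, a minimal runnable sketch that contrasts the trigger options above, assuming a local Spark installation and using the built-in `rate` source (which emits timestamped rows at a configurable rate; the app name and row rate here are illustrative choices, not from the question):

```python
from pyspark.sql import SparkSession

# Assumes pyspark is installed and a local Spark session can start.
spark = SparkSession.builder.appName("trigger-demo").getOrCreate()

# The built-in `rate` source generates (timestamp, value) rows and is
# convenient for experimenting with trigger behavior.
df = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

# Fixed 5-second micro-batches -- the behavior the question asks for (option A):
query = (
    df.writeStream
      .outputMode("append")
      .format("console")
      .trigger(processingTime="5 seconds")
      .start()
)

# Alternatives, for comparison:
#   .trigger(once=True)              -> one micro-batch, then stop (option C)
#   .trigger(availableNow=True)      -> drain all available data, then stop (Spark 3.3+)
#   .trigger(continuous="5 seconds") -> continuous processing mode; the interval
#                                       is a checkpoint interval, not a batch size (option B)
#   no .trigger(...) call            -> next batch starts as soon as the
#                                       previous one finishes (option D)

query.awaitTermination(timeout=15)  # let it run briefly, then return
query.stop()
spark.stop()
```

Running this prints a console-sink batch roughly every 5 seconds while the query is active.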
References:
PySpark Structured Streaming Programming Guide — trigger types: processingTime, once, continuous.
Databricks Exam Guide (June 2025), section "Structured Streaming" — controlling streaming triggers and batch intervals.