Google Exam Associate Data Practitioner Topic 2 Question 15 Discussion

Actual exam question for Google's Associate Data Practitioner exam
Question #: 15
Topic #: 2

You are designing a pipeline to process data files that arrive in Cloud Storage by 3:00 am each day. Data processing is performed in stages, where the output of one stage becomes the input of the next. Each stage takes a long time to run. Occasionally a stage fails, and you have to address the problem. You need to ensure that the final output is generated as quickly as possible. What should you do?

Suggested Answer: D

Using Cloud Composer to design the processing pipeline as a Directed Acyclic Graph (DAG) is the most suitable approach because:

- Fault tolerance: Cloud Composer (based on Apache Airflow) lets you handle failures at specific stages. You can clear the state of a failed task and rerun it without reprocessing the entire pipeline.

- Stage-based processing: DAGs are ideal for workflows with interdependent stages, where the output of one stage serves as input to the next.

- Efficiency: Only the failed stage and its downstream tasks are rerun, which minimizes downtime and produces the final output as quickly as possible.
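
For illustration, here is a minimal sketch of how such a staged pipeline could be expressed as an Airflow DAG running on Cloud Composer. The DAG name, schedule, bucket, object path, and run_stage callable are hypothetical placeholders, not part of the exam question; the point is that stages are chained as dependent tasks, so a failed stage can be cleared and rerun without repeating earlier stages.

```python
# Hypothetical sketch of the staged pipeline as an Airflow DAG (Cloud Composer).
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.google.cloud.sensors.gcs import GCSObjectExistenceSensor


def run_stage(stage_name: str, **_):
    # Placeholder for the real processing logic of each stage.
    print(f"Running {stage_name}")


with DAG(
    dag_id="daily_staged_processing",      # hypothetical DAG name
    schedule_interval="0 3 * * *",         # input files are expected by 3:00 am
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 1, "retry_delay": timedelta(minutes=10)},
) as dag:
    # Wait for the daily file to land in Cloud Storage before processing starts.
    wait_for_file = GCSObjectExistenceSensor(
        task_id="wait_for_input_file",
        bucket="example-input-bucket",      # hypothetical bucket
        object="daily/{{ ds }}/input.csv",  # hypothetical object path
    )

    stage_1 = PythonOperator(
        task_id="stage_1",
        python_callable=run_stage,
        op_kwargs={"stage_name": "stage_1"},
    )
    stage_2 = PythonOperator(
        task_id="stage_2",
        python_callable=run_stage,
        op_kwargs={"stage_name": "stage_2"},
    )
    stage_3 = PythonOperator(
        task_id="stage_3",
        python_callable=run_stage,
        op_kwargs={"stage_name": "stage_3"},
    )

    # Each stage's output feeds the next. If a stage fails, you clear only that
    # task in the Airflow UI and it (plus its downstream tasks) reruns, without
    # reprocessing the stages that already succeeded.
    wait_for_file >> stage_1 >> stage_2 >> stage_3
```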


Contribute your Thoughts:

Kristofer
6 days ago
Option B sounds like the way to go. Dataflow's ability to restart the pipeline after fixing errors seems like the most efficient approach.
upvoted 0 times
