
Google Associate Data Practitioner Exam - Topic 4 Question 8 Discussion

Actual exam question for Google's Associate Data Practitioner exam
Question #: 8
Topic #: 4

You need to create a data pipeline that streams event information from applications in multiple Google Cloud regions into BigQuery for near real-time analysis. The data requires transformation before loading. You want to create the pipeline using a visual interface. What should you do?

Suggested Answer: A

Pushing event information to a Pub/Sub topic and then creating a Dataflow job with the Dataflow job builder is the most suitable option. The job builder is a visual, no-code pipeline designer in the Google Cloud console, so it satisfies the visual-interface requirement while still letting you define transformations and load the results into BigQuery. Because Pub/Sub is a global service, applications in any Google Cloud region can publish to a single topic, and a streaming Dataflow job applies the transformations continuously, which meets the near real-time analysis requirement.
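Under the hood, the job builder assembles and runs an Apache Beam pipeline. For readers who want to see what that pipeline amounts to, here is a minimal hand-written Beam equivalent in Python; the topic, table, and event fields are hypothetical placeholders, not part of the exam question.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def to_row(message: bytes) -> dict:
    """Parse a raw Pub/Sub message and shape it into a BigQuery row."""
    event = json.loads(message.decode("utf-8"))
    return {
        "event_id": event["id"],                      # hypothetical field names
        "region": event.get("region", "unknown"),
        "payload": json.dumps(event.get("payload", {})),
    }


# streaming=True makes Dataflow run this as a long-lived streaming job.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            topic="projects/my-project/topics/app-events"   # hypothetical topic
        )
        | "Transform" >> beam.Map(to_row)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.app_events",              # hypothetical table
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    )
```

The same read-transform-write shape is what you lay out visually in the job builder: a Pub/Sub source step, a transformation step, and a BigQuery sink.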


Contribute your Thoughts:

Giuseppe
6 days ago
I think B could work too, but it might be slower.
upvoted 0 times
Bong
12 days ago
Option A is the best choice for real-time processing!
upvoted 0 times
Tayna
17 days ago
I vaguely remember something about using Cloud Storage and external tables, but that seems more suited to batch processing than to real-time analysis.
upvoted 0 times
Kattie
23 days ago
I feel like option A is a solid choice since Dataflow is designed for data processing, but I wonder if the visual interface part is covered there.
upvoted 0 times
Katheryn
28 days ago
I think option B sounds familiar because we practiced using Cloud Run for transformations, but I can't recall if it's the most efficient way to load data into BigQuery.
upvoted 0 times
Audrie
1 month ago
I remember we discussed using Pub/Sub for streaming data, but I'm not sure if Dataflow is the best choice for this specific scenario.
upvoted 0 times
Rusty
1 month ago
I like the simplicity of option C with the Pub/Sub to BigQuery subscription. That could be a good way to get the data loaded quickly without having to worry about the transformation step. I'll have to think through the pros and cons of that versus the other options.
upvoted 0 times
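For what it's worth, the BigQuery subscription Rusty mentions (option C) really is the simplest path, but it appends messages to the table as-is: there is no hook for transforming data between Pub/Sub and BigQuery, which is exactly why it misses the requirement here. A minimal sketch with the Pub/Sub Python client, using hypothetical project, topic, and table names:

```python
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
topic_path = subscriber.topic_path("my-project", "app-events")
subscription_path = subscriber.subscription_path("my-project", "app-events-to-bq")

# Messages are appended straight to the table; there is no step where
# per-message transformation can happen between Pub/Sub and BigQuery.
bigquery_config = pubsub_v1.types.BigQueryConfig(
    table="my-project.analytics.app_events_raw",   # hypothetical table
    write_metadata=True,
)

with subscriber:
    subscriber.create_subscription(
        request={
            "name": subscription_path,
            "topic": topic_path,
            "bigquery_config": bigquery_config,
        }
    )
```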
Ronnie
1 month ago
Option D with Cloud Storage and a scheduled BigQuery job seems like it could work, but I'm not sure if that fully meets the "near real-time" analysis requirement. I'd probably lean more towards A or B to get the data in faster.
upvoted 0 times
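Ronnie's instinct is right: the Cloud Storage route (option D) reduces to a recurring BigQuery load job, so data freshness is bounded by the schedule interval rather than by event arrival. A sketch of one such load step with the BigQuery Python client; the bucket, path, and table are hypothetical, and the scheduling itself would come from something like Cloud Scheduler or a scheduled query:

```python
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

# Each scheduled run picks up whatever files landed since the last run,
# so rows only become queryable after the next run fires; batch, not streaming.
load_job = client.load_table_from_uri(
    "gs://my-events-bucket/events/*.json",   # hypothetical bucket and path
    "my-project.analytics.app_events",       # hypothetical destination table
    job_config=job_config,
)
load_job.result()  # block until the load completes
```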
Cordell
1 month ago
Hmm, I'm a little unsure about this one. The requirement to use a visual interface makes me think option B with Cloud Run might be the way to go, but I'm not 100% confident that's the best approach. I'll need to review the details carefully.
upvoted 0 times
Sylvie
1 month ago
This looks like a pretty straightforward data pipeline setup. I think I'd go with option A - using Pub/Sub and Dataflow seems like the most direct way to get the data into BigQuery with the required transformations.
upvoted 0 times
