Google Professional Data Engineer Exam - Topic 1 Question 112 Discussion

Actual exam question for Google's Professional Data Engineer exam

Question #: 112
Topic #: 1

A data scientist has created a BigQuery ML model and asks you to create an ML pipeline to serve predictions. You have a REST API application with the requirement to serve predictions for an individual user ID with latency under 100 milliseconds. You use the following query to generate predictions:

SELECT predicted_label, user_id FROM ML.PREDICT(MODEL 'dataset.model', TABLE user_features)

How should you create the ML pipeline?

A. Add a WHERE clause to the query, and grant the BigQuery Data Viewer role to the application service account.

B. Create an Authorized View with the provided query. Share the dataset that contains the view with the application service account.

C. Create a Cloud Dataflow pipeline using BigQueryIO to read results from the query. Grant the Dataflow Worker role to the application service account.

D. Create a Cloud Dataflow pipeline using BigQueryIO to read predictions for all users from the query. Write the results to Cloud Bigtable using BigtableIO. Grant the Bigtable Reader role to the application service account so that the application can read predictions for individual users from Cloud Bigtable.

Suggested Answer: D

by Lashandra at Aug 14, 2025, 03:56 AM
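
For readers who want to see the shape of option D, here is a minimal sketch of the precompute-then-lookup pattern it describes: predictions for all users are written to a key-value store in a batch step, and the REST application then does a single key lookup per request. This is an illustration only. A plain Python dict stands in for Cloud Bigtable, and the names `row_key`, `load_batch_predictions`, and `get_prediction`, as well as the `user#<id>` key format, are hypothetical and do not come from the question.

```python
from typing import Dict, Iterable, Mapping, Optional

# Assumption: a dict stands in for a Cloud Bigtable table (row key -> label).
PredictionStore = Dict[str, str]


def row_key(user_id: str) -> str:
    """Build the row key for a user (hypothetical key format)."""
    return f"user#{user_id}"


def load_batch_predictions(
    rows: Iterable[Mapping[str, str]], store: PredictionStore
) -> int:
    """Batch step: write one entry per user from the ML.PREDICT output rows.

    Each row is expected to carry the two columns selected by the query in
    the question: user_id and predicted_label. Returns the rows written.
    """
    written = 0
    for row in rows:
        store[row_key(row["user_id"])] = row["predicted_label"]
        written += 1
    return written


def get_prediction(store: PredictionStore, user_id: str) -> Optional[str]:
    """Serving step: a single key lookup, no query executed per request."""
    return store.get(row_key(user_id))


# Example with made-up rows in the shape of the query output.
store: PredictionStore = {}
count = load_batch_predictions(
    [
        {"user_id": "42", "predicted_label": "churn"},
        {"user_id": "7", "predicted_label": "stay"},
    ],
    store,
)
label = get_prediction(store, "42")
missing = get_prediction(store, "999")
```

The point of the design is that the expensive work (running the model over every user) happens ahead of time, so the request path is only a read by row key. In a real deployment the batch step would be the Dataflow pipeline and the lookup would be a Bigtable read.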

Verlene

4 days ago

I practiced a similar question about using Cloud Dataflow, but I can't recall if it's the best option here since we need predictions for a single user.

upvoted 0 times

...

Katina

10 days ago

I think creating an Authorized View could be a good approach since it simplifies access control, but I’m not entirely sure how it impacts performance.

upvoted 0 times

...

Jesusa

15 days ago

I remember we discussed the importance of latency in serving predictions, but I'm not sure if adding a WHERE clause is enough to meet the 100 ms requirement.

upvoted 0 times

...

Jennifer

21 days ago

This is a tricky one. I'm leaning towards Option C with the Cloud Dataflow pipeline. That way, I can control the entire data processing flow and ensure the latency requirements are met. The Dataflow Worker role should give the application the necessary access.

upvoted 0 times

...

Melinda

26 days ago

Okay, I think I've got a strategy here. Option B with the Authorized View seems like the best approach. That way, I can grant the necessary permissions to the application service account and the query will be optimized for low latency predictions.

upvoted 0 times

...

Gail

1 month ago

Hmm, I'm a bit confused. The question mentions a REST API application, but it's not clear how that fits into the pipeline. I'll need to think through the different options and how they address the latency requirement.

upvoted 0 times

...

Frederic

1 month ago

This seems like a straightforward question, but I want to make sure I understand the requirements correctly. The key is to create a pipeline that can serve predictions for individual users with low latency.

upvoted 0 times

...

Precious

2 months ago

I think option D is the way to go. Using Cloud Dataflow to read predictions for all users and writing them to Cloud Bigtable will allow for efficient access to individual user predictions.

upvoted 0 times

...

Lashaun

2 months ago

I'm leaning towards option C. Creating a Cloud Dataflow pipeline using BigQueryIO seems like a scalable solution for serving predictions with low latency.

upvoted 0 times

...

Denny

2 months ago

I disagree. I believe option B is the best choice. Creating an Authorized View will provide the necessary access to the application service account.

upvoted 0 times

...

Leigha

3 months ago

You know, I was thinking the same thing as Avery. The Dataflow pipeline seems like the best way to ensure low latency and high performance, even with a large dataset. Option C is my pick.

upvoted 0 times

...

Avery

3 months ago

Hmm, I'm not sure about that. What if the dataset is really large? Won't that lead to performance issues? I'd go with Option C and use Dataflow to handle the reading and processing.

upvoted 0 times

Amina

2 months ago

I agree, using Dataflow for reading and processing will help with performance issues.

upvoted 0 times

...

Shawna

2 months ago

Option C seems like the best choice. Dataflow can handle large datasets efficiently.

upvoted 0 times

...

Ceola

3 months ago

I think we should go with option A. Adding a WHERE clause to the query seems like the most efficient way to serve predictions for individual user IDs.

upvoted 0 times

...

Marge

3 months ago

Option B is the way to go. Creating an Authorized View and sharing the dataset with the application service account is the most efficient and scalable solution here.

upvoted 0 times

Dalene

2 months ago

B) Create an Authorized View with the provided query. Share the dataset that contains the view with the application service account.

upvoted 0 times

...