
Google Professional Data Engineer Exam - Topic 3 Question 114 Discussion

Actual exam question for Google's Professional Data Engineer exam
Question #: 114
Topic #: 3

You have a data pipeline with a Dataflow job that aggregates and writes time series metrics to Bigtable. You notice that data is slow to update in Bigtable. This data feeds a dashboard used by thousands of users across the organization. You need to support additional concurrent users and reduce the amount of time required to write the data. What should you do?

Choose 2 answers
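
For context, the two scaling levers most discussed in the comments below, raising the Dataflow worker ceiling and adding Bigtable nodes, can be exercised roughly as follows. This is a minimal Beam Python sketch; the project, instance, and cluster IDs and the node counts are placeholders, not values taken from the question.

    from apache_beam.options.pipeline_options import PipelineOptions
    from google.cloud import bigtable

    # Hypothetical identifiers for illustration only.
    PROJECT, INSTANCE, CLUSTER = "my-project", "metrics-instance", "metrics-cluster"

    # Raise the Dataflow worker ceiling so autoscaling can add parallel writers.
    options = PipelineOptions(
        runner="DataflowRunner",
        project=PROJECT,
        region="us-central1",
        max_num_workers=50,  # assumed ceiling; tune to the workload
    )

    # Add nodes to the Bigtable cluster to raise write (and read) throughput.
    admin = bigtable.Client(project=PROJECT, admin=True)
    cluster = admin.instance(INSTANCE).cluster(CLUSTER)
    cluster.serve_nodes = 10  # assumed target node count
    cluster.update()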


Contribute your Thoughts:

Claribel
9 hours ago
I think using the CoGroupByKey transform is a good idea too.
upvoted 0 times
...
Latrice
6 days ago
Totally agree, more nodes means better handling of concurrent users!
upvoted 0 times
...
Edward
11 days ago
Increasing the number of nodes in the Bigtable cluster can help with performance.
upvoted 0 times
...
Ngoc
16 days ago
E) is a must, but I'd also try B) just for fun. Flatten that data!
upvoted 0 times
...
Willard
21 days ago
A) and D) - local execution and more workers? What is this, amateur hour?
upvoted 0 times
...
Vallie
26 days ago
I'd go with C) and E). CoGroupByKey and more Bigtable nodes will make this pipeline fly!
upvoted 0 times
...
Danica
1 month ago
D) and E) are the way to go. Increase the workers and the Bigtable cluster size.
upvoted 0 times
...
Denny
1 month ago
B) and E) should do the trick. Flatten and more Bigtable nodes should help with the slow updates.
upvoted 0 times
...
Jeff
1 month ago
I’m a bit confused about the Flatten transform; I thought it was more for restructuring data rather than improving write performance. Not sure if option B is the right call.
upvoted 0 times
...
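
For reference on the point above: Flatten only concatenates several PCollections of the same type into one; it does not re-key, aggregate, or speed up writes on its own. A minimal Beam Python sketch with made-up inputs:

    import apache_beam as beam

    with beam.Pipeline() as p:
        cpu = p | "cpu" >> beam.Create([("vm-1", 0.72), ("vm-2", 0.55)])
        mem = p | "mem" >> beam.Create([("vm-1", 0.31), ("vm-2", 0.48)])

        # The result is simply the union of the two inputs.
        merged = (cpu, mem) | beam.Flatten()
        merged | beam.Map(print)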
Marge
2 months ago
I practiced a similar question about optimizing data pipelines, and I feel like increasing the Bigtable cluster nodes could definitely improve write speeds. So, option E seems plausible.
upvoted 0 times
...
Kenneth
2 months ago
I'm not entirely sure, but I think using the CoGroupByKey transform could help with aggregating data more efficiently. That might be option C?
upvoted 0 times
...
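
For anyone unsure what the transform mentioned above does: CoGroupByKey groups values from several keyed PCollections by key, which is why it comes up for join-style aggregation. A minimal Beam Python sketch with made-up inputs:

    import apache_beam as beam

    with beam.Pipeline() as p:
        metrics = p | "metrics" >> beam.Create([("vm-1", 0.72), ("vm-2", 0.55)])
        labels = p | "labels" >> beam.Create([("vm-1", "prod"), ("vm-2", "dev")])

        # Yields one element per key, e.g.
        # ("vm-1", {"metrics": [0.72], "labels": ["prod"]})
        joined = {"metrics": metrics, "labels": labels} | beam.CoGroupByKey()
        joined | beam.Map(print)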
Regenia
2 months ago
I'm a bit confused by the Flatten transform option. That doesn't seem directly related to the performance problem. I'm going to go with increasing the Dataflow workers and the Bigtable cluster size - those seem like the most straightforward ways to scale up the system.
upvoted 0 times
...
Antonio
2 months ago
I remember something about increasing the number of workers in Dataflow to handle more concurrent users, so maybe option D is a good choice.
upvoted 0 times
...
Mollie
2 months ago
Okay, let's think this through. Configuring local execution for Dataflow probably won't help with the Bigtable performance issues. I'm leaning towards the Bigtable cluster size increase and using CoGroupByKey to optimize the data writes.
upvoted 0 times
...
Remona
3 months ago
But B could complicate the pipeline. D and E seem safer.
upvoted 0 times
...
Gabriele
3 months ago
I think D and E are the best options. More workers and nodes can really help.
upvoted 0 times
...
Angella
3 months ago
Hmm, the question is asking for two answers, so I'll need to pick carefully. I think the CoGroupByKey transform and increasing the Bigtable cluster size are the best options to try.
upvoted 0 times
...
Camellia
3 months ago
I'm not sure about the Flatten transform, but increasing the number of Dataflow workers and Bigtable nodes sounds like a good way to scale up the pipeline and handle more concurrent users.
upvoted 0 times
Lorean
2 months ago
I agree, more workers can definitely help.
upvoted 0 times
...
...