Google Professional Data Engineer Exam - Topic 4 Question 60 Discussion

You are collecting IoT sensor data from millions of devices across the world and storing the data in BigQuery. Your access pattern is based on recent data filtered by location_id and device_version with the following query:

You want to optimize your queries for cost and performance. How should you structure your data?
A) Partition table data by create_date, location_id, and device_version
B) Partition table data by create_date; cluster table data by location_id and device_version
C) Cluster table data by create_date, location_id, and device_version
D) Cluster table data by create_date; partition by location_id and device_version
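
For reference, the layouts named in options B and C can be written as BigQuery DDL. The snippet below is a minimal sketch, not the exam's or the site's answer: the table name `iot.sensor_readings`, the `reading` column, and all column types are assumptions made for illustration, and the helper `build_ddl` is hypothetical. It only builds the statement text and does not talk to BigQuery.

```python
from typing import Optional, Sequence

# BigQuery allows at most four clustering columns per table.
MAX_CLUSTER_COLUMNS = 4


def build_ddl(table: str, partition_by: Optional[str], cluster_by: Sequence[str]) -> str:
    """Return a CREATE TABLE statement for the requested layout.

    The schema is hypothetical; the question does not give column types.
    """
    if len(cluster_by) > MAX_CLUSTER_COLUMNS:
        raise ValueError("BigQuery supports at most 4 clustering columns")
    lines = [
        f"CREATE TABLE {table} (",
        "  create_date DATE,",
        "  location_id STRING,",
        "  device_version STRING,",
        "  reading FLOAT64",
        ")",
    ]
    if partition_by:
        # A table is partitioned by a single column or expression.
        lines.append(f"PARTITION BY {partition_by}")
    if cluster_by:
        # Clustering accepts several columns; their order matters.
        lines.append("CLUSTER BY " + ", ".join(cluster_by))
    return "\n".join(lines)


# Option B: partition by date, cluster by the two filter columns.
option_b = build_ddl("iot.sensor_readings", "create_date", ["location_id", "device_version"])

# Option C: no partitioning, cluster by all three columns.
option_c = build_ddl("iot.sensor_readings", None, ["create_date", "location_id", "device_version"])

print(option_b)
print(option_c)
```

Options A and D cannot be written this way, because a BigQuery table is partitioned on a single column, whereas clustering on multiple columns (as in B and C) is allowed.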

Actual exam question for Google's Professional Data Engineer exam
Question #: 60
Topic #: 4

Suggested Answer: C

Contribute your Thoughts:

Jolanda
7 months ago
C doesn't seem efficient for large datasets, though.
upvoted 0 times
...
Sophia
7 months ago
B is definitely the way to go, it balances everything nicely!
upvoted 0 times
...
Sherron
7 months ago
Wait, can we really cluster by multiple fields like that?
upvoted 0 times
...
Adelaide
8 months ago
I disagree, A might be more straightforward.
upvoted 0 times
...
Mickie
8 months ago
B seems like the best option for optimizing both cost and performance.
upvoted 0 times
...
Kate
8 months ago
I believe clustering by location_id and device_version could help with query performance, but I’m not entirely sure if that’s the right choice for this scenario.
upvoted 0 times
...
Tarra
8 months ago
I’m a bit confused about whether to cluster or partition first. I feel like I’ve seen similar questions, but I can’t recall the best approach.
upvoted 0 times
...
Shawana
8 months ago
I think option B sounds familiar; it seems like a good way to optimize for both cost and performance based on our practice questions.
upvoted 0 times
...
Alexis
8 months ago
I remember we discussed partitioning and clustering in class, but I'm not sure if I should prioritize one over the other here.
upvoted 0 times
...
Murray
8 months ago
Okay, I think I've got this. If an individual end user can't access a business application, that's likely a P1 issue since it's impacting a user's ability to do their job. I'll go with A - Yes.
upvoted 0 times
...
Rozella
8 months ago
I'm pretty confident that option B is the right answer here. Modular inputs and HEC seem like the recommended way to ingest data on clustered indexers.
upvoted 0 times
...