Google Professional Machine Learning Engineer Exam - Topic 11 Question 38 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 38
Topic #: 11

You want to rebuild your ML pipeline for structured data on Google Cloud. You are using PySpark to conduct data transformations at scale, but your pipelines are taking over 12 hours to run. To speed up development and pipeline run time, you want to use a serverless tool and SQL syntax. You have already moved your raw data into Cloud Storage. How should you build the pipeline on Google Cloud while meeting the speed and processing requirements?

Suggested Answer: B

Contribute your Thoughts:

Precious
4 months ago
Not sure about D, converting everything to SQL might be a hassle.
upvoted 0 times
...
Reuben
4 months ago
I think C is a bit convoluted, just stick with BigQuery (D).
upvoted 0 times
...
Elly
4 months ago
I’m surprised no one mentioned using Dataproc (B) for scalability!
upvoted 0 times
...
Denny
4 months ago
I disagree, Data Fusion (A) could be easier for visualizing the pipeline.
upvoted 0 times
...
Krystal
5 months ago
Option D seems the fastest, BigQuery is super efficient!
upvoted 0 times
...
Asha
5 months ago
I practiced a similar question about using BigQuery for transformations, and it seems like option D could be the most efficient.
upvoted 0 times
...
Maryann
5 months ago
I think converting PySpark to SparkSQL might help, but running it on Dataproc could still be slow.
upvoted 0 times
...
Mitsue
5 months ago
I remember we discussed using Data Fusion for building pipelines, but I'm not sure if it's the best choice for speed.
upvoted 0 times
...
Gussie
5 months ago
I'm a bit confused about the federated queries in option C; I don't recall how they work with Cloud SQL and BigQuery together.
upvoted 0 times
...
Kallie
5 months ago
Okay, let's see here. The users are complaining about not having the correct new group memberships, so I'm thinking we need to do something to fix that. Maybe reordering the directories or disabling one of them could help?
upvoted 0 times
...
Kara
5 months ago
This is a tricky one. I'll need to re-read the prompt a few times and really think through each option to make sure I select the correct answer.
upvoted 0 times
...
Selma
5 months ago
I'm a little confused on the differences between predictive and prescriptive analytics. I'll have to review those concepts before answering this.
upvoted 0 times
...
Derick
5 months ago
I'm confident I can solve this. The average inventory is the midpoint between the maximum and minimum inventory levels, which is the EOQ plus half the safety stock.
upvoted 0 times
...
Dominga
5 months ago
The MAX and MIN functions work on character data types, so that's an interesting one. I'll make sure to remember that.
upvoted 0 times
...
Mammie
5 months ago
The term "westbound" rings a bell, but I struggle to remember how it aligns with the context of this question.
upvoted 0 times
...
Frederica
5 months ago
Vishing attacks target the user instead of the network itself, so it doesn't directly compromise confidentiality. I'm leaning towards packet sniffing for this one, but I'm still a bit uncertain.
upvoted 0 times
...
