Google Exam Professional Machine Learning Engineer Topic 1 Question 98 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 98
Topic #: 1
[All Professional Machine Learning Engineer Questions]

You trained a model on data stored in a Cloud Storage bucket. The model needs to be retrained frequently in Vertex AI Training using the latest data in the bucket. Data preprocessing is required prior to retraining. You want to build a simple and efficient near-real-time ML pipeline in Vertex AI that will preprocess the data when new data arrives in the bucket. What should you do?

A. Create a pipeline using the Vertex AI SDK. Schedule the pipeline with Cloud Scheduler to preprocess the new data in the bucket. Store the processed features in Vertex AI Feature Store.
B. Create a Cloud Run function that is triggered when new data arrives in the bucket. The function initiates a Vertex AI Pipeline to preprocess the new data and store the processed features in Vertex AI Feature Store.
C. Build a Dataflow pipeline to preprocess the new data and store the processed features in BigQuery. Configure a cron job to trigger the pipeline.
D. Use the Vertex AI SDK to preprocess the new data in the bucket prior to each model retraining. Store the processed features in BigQuery.

Suggested Answer: B

A Cloud Run function can be triggered by Cloud Storage events, which makes it ideal for near-real-time, event-driven processing. The function then initiates a Vertex AI Pipeline that preprocesses the new data and stores the features in Vertex AI Feature Store, which matches the retraining requirements. Cloud Scheduler (Option A) runs jobs on a fixed schedule rather than in response to events, so newly arrived data would sit idle until the next scheduled run. A cron-triggered Dataflow pipeline (Option C) is likewise scheduled rather than event-driven and adds unnecessary operational overhead, and preprocessing manually with the Vertex AI SDK before each retraining (Option D) is not automated at all.
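For Option B, a minimal sketch of the triggered function might look like the following. Every concrete name here (project, region, bucket paths, the compiled pipeline spec, and the `input_data` parameter) is an illustrative assumption, not something stated in the question; the pattern is simply: receive the Cloud Storage event, build the URI of the new object, and submit a `PipelineJob` with the Vertex AI SDK.

```python
# Hedged sketch of Option B: a Cloud Run function (Cloud Storage trigger via
# Eventarc) that submits a Vertex AI Pipeline run when a new object arrives.
# Project, region, bucket, and pipeline-spec paths below are placeholders.

def gcs_uri_from_event(data: dict) -> str:
    """Build the gs:// URI of the object named in a Cloud Storage event payload."""
    return f"gs://{data['bucket']}/{data['name']}"

# In a real deployment this handler would be decorated with
# @functions_framework.cloud_event and deployed with a
# google.cloud.storage.object.v1.finalized Eventarc trigger.
def handle_new_object(cloud_event):
    from google.cloud import aiplatform  # requires google-cloud-aiplatform

    new_data_uri = gcs_uri_from_event(cloud_event.data)

    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.PipelineJob(
        display_name="preprocess-on-arrival",
        template_path="gs://my-bucket/specs/preprocess_pipeline.json",  # compiled KFP spec
        pipeline_root="gs://my-bucket/pipeline-root",
        parameter_values={"input_data": new_data_uri},  # hypothetical pipeline parameter
    )
    job.submit()  # non-blocking: the pipeline preprocesses the data and
    # writes features to Vertex AI Feature Store asynchronously
```

Because `job.submit()` returns immediately, the function finishes well within Cloud Run's request timeout while the preprocessing (and any downstream retraining step) runs asynchronously as a pipeline in Vertex AI.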


Contribute your Thoughts:

Mike
19 days ago
I see the benefits of option C as well. Building a Dataflow pipeline to preprocess data and store features in BigQuery could be a solid choice.
Doug
24 days ago
Option F: Hire a team of psychic interns to monitor the bucket and trigger the pipeline whenever they sense new data. It's the future of MLOps!
Ernest
26 days ago
Option E: Write a Python script that uses a Ouija board to divine the latest data and automatically retrain the model. It's foolproof!
Wynell
7 days ago
B) Create a Cloud Run function that is triggered when new data arrives in the bucket. The function initiates a Vertex AI Pipeline to preprocess the new data and store the processed features in Vertex AI Feature Store.
Dalene
9 days ago
A) Create a pipeline using the Vertex AI SDK. Schedule the pipeline with Cloud Scheduler to preprocess the new data in the bucket. Store the processed features in Vertex AI Feature Store.
Amie
29 days ago
Option A is a good backup plan, but it requires additional scheduling and coordination. Option B just seems like the most straightforward and elegant solution here.
Joesph
13 days ago
Yeah, Option B seems like the most elegant way to build a near-real-time ML pipeline.
Jolanda
14 days ago
I agree, Option B with Cloud Run function sounds like the most straightforward solution.
Charlesetta
18 days ago
I think Option B is the way to go. It's simple and efficient.
Cassie
1 month ago
I prefer option B. Using a Cloud Run function to trigger a Vertex AI Pipeline sounds more straightforward to me.
Virgina
1 month ago
Hmm, Option D seems a bit outdated. Preprocessing the data before each retraining seems like a lot of manual work. I'd prefer a more automated solution like Option B.
Cherrie
1 month ago
I'm not sure about Option C. Configuring a cron job to trigger a Dataflow pipeline seems a bit overkill for this use case. Why not just use Vertex AI's built-in capabilities?
Eve
16 days ago
D) Use the Vertex AI SDK to preprocess the new data in the bucket prior to each model retraining. Store the processed features in BigQuery.
Han
18 days ago
B) Create a Cloud Run function that is triggered when new data arrives in the bucket. The function initiates a Vertex AI Pipeline to preprocess the new data and store the processed features in Vertex AI Feature Store.
Tammara
25 days ago
A) Create a pipeline using the Vertex AI SDK. Schedule the pipeline with Cloud Scheduler to preprocess the new data in the bucket. Store the processed features in Vertex AI Feature Store.
Tula
1 month ago
I agree with Jackie. Storing the processed features in Vertex AI Feature Store seems like a good idea for efficiency.
Nieves
1 month ago
Option B looks like the most efficient solution. Triggering a pipeline when new data arrives in the bucket is a great way to keep the model up-to-date in near-real-time.
Javier
28 days ago
A) Create a pipeline using the Vertex AI SDK. Schedule the pipeline with Cloud Scheduler to preprocess the new data in the bucket. Store the processed features in Vertex AI Feature Store.
Dallas
1 month ago
B) Create a Cloud Run function that is triggered when new data arrives in the bucket. The function initiates a Vertex AI Pipeline to preprocess the new data and store the processed features in Vertex AI Feature Store.
Jackie
1 month ago
I think option A is the best choice. It allows us to create a pipeline using Vertex AI SDK and schedule it with Cloud Scheduler.
