Google Professional Machine Learning Engineer Exam - Topic 6 Question 104 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 104
Topic #: 6

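You manage a team of data scientists who use a cloud-based backend system to submit training jobs. This system has become very difficult to administer, and you want to use a managed service instead. The data scientists you work with use many different frameworks, including Keras, PyTorch, Theano, scikit-learn, and custom libraries. What should you do?

A) Use the Vertex AI Training to submit training jobs using any framework.
B) Configure Kubeflow to run on Google Kubernetes Engine and submit training jobs through TFJob.
C) Create a library of VM images on Compute Engine, and publish these images on a centralized repository.
D) Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.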

Suggested Answer: A

The best way to move multi-framework training onto a managed service is Vertex AI Training. It is a fully managed service for training custom models on Google Cloud with any framework, such as TensorFlow, PyTorch, scikit-learn, or XGBoost. You can also use custom containers to run your own libraries and dependencies, which covers frameworks outside that list, such as Theano, as well as the team's custom libraries. Vertex AI Training handles infrastructure provisioning, scaling, and monitoring for you, so you can focus on model development and optimization. It also integrates with other Vertex AI services, such as Vertex AI Pipelines, Vertex AI Experiments, and Vertex AI Prediction.

The other options are less suitable as a managed service for submitting training jobs with different frameworks, because:

Configuring Kubeflow to run on Google Kubernetes Engine and submitting training jobs through TFJob would require more infrastructure maintenance, as Kubeflow is not a fully managed service, and you would have to provision and manage your own Kubernetes cluster. This could also cost more, as you pay for the cluster resources whether or not training jobs are running. TFJob is also the Kubeflow operator for TensorFlow training, so it does not cover the team's other frameworks the way Vertex AI Training does.

Creating a library of VM images on Compute Engine, and publishing these images on a centralized repository would require more development time and effort, as you would have to create and maintain different VM images for different frameworks and libraries. You would also have to manually configure and launch the VMs for each training job, and handle the scaling and monitoring yourself. This would not leverage the benefits of a managed service, such as Vertex AI Training.

Setting up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure would require more configuration and administration, as Slurm is not a native Google Cloud service, and you would have to install and manage it on your own VMs or clusters. Slurm is also a general-purpose workload manager, and might not have the same level of integration and optimization for ML frameworks and libraries as Vertex AI Training.

Reference:

Vertex AI Training | Google Cloud

Kubeflow on Google Cloud | Google Cloud

TFJob for training TensorFlow models with Kubernetes | Kubeflow

Compute Engine | Google Cloud

Slurm Workload Manager
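
To make the custom-container point in the explanation concrete, here is a minimal sketch of how a job for a framework such as Theano might be described and submitted. It assumes the `google-cloud-aiplatform` Python SDK; the image URI, project, region, bucket, and the helper names `build_worker_pool_spec` and `submit_custom_job` are hypothetical and not part of the original question. The spec builder runs on its own, and the submit helper only works with the SDK installed and valid Google Cloud credentials.

```python
from typing import Dict, List, Optional


def build_worker_pool_spec(
    image_uri: str,
    args: Optional[List[str]] = None,
    machine_type: str = "n1-standard-4",
    replica_count: int = 1,
) -> Dict:
    """Describe one worker pool that runs a custom training container."""
    return {
        "machine_spec": {"machine_type": machine_type},
        "replica_count": replica_count,
        "container_spec": {"image_uri": image_uri, "args": list(args or [])},
    }


def submit_custom_job(
    display_name: str,
    worker_pool_specs: List[Dict],
    project: str,
    location: str,
    staging_bucket: str,
):
    """Submit the job to Vertex AI Training (needs the SDK and credentials)."""
    # Imported lazily so the spec builder above works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location, staging_bucket=staging_bucket)
    job = aiplatform.CustomJob(
        display_name=display_name,
        worker_pool_specs=worker_pool_specs,
    )
    job.run()  # blocks until the training job finishes
    return job


# Hypothetical image: a team-built container with Theano and in-house libraries.
spec = build_worker_pool_spec(
    image_uri="us-docker.pkg.dev/my-project/training/theano-custom:latest",
    args=["--epochs", "10"],
)
```

A call such as `submit_custom_job("theano-train", [spec], project="my-project", location="us-central1", staging_bucket="gs://my-bucket")` would then hand the container to the managed service. The same pattern applies to each framework: one container image per environment, with no cluster or VM for the team to administer.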


Carisa
2 months ago
VM images could lead to versioning issues, not ideal.
upvoted 0 times
...
Kenneth
2 months ago
Slurm seems overkill for just training jobs, right?
upvoted 0 times
...
Goldie
3 months ago
Wait, can Vertex AI really handle all those frameworks?
upvoted 0 times
...
Lazaro
3 months ago
Vertex AI Training supports multiple frameworks, sounds good!
upvoted 0 times
...
Rutha
3 months ago
I think Kubeflow is more flexible for custom setups.
upvoted 0 times
...
Lashunda
3 months ago
Slurm sounds familiar, but I can't recall if it's the best fit for our cloud setup. I thought it was more for on-premise clusters.
upvoted 0 times
...
Zena
4 months ago
I feel like using VM images on Compute Engine might be too manual and not really a managed service, but it could work if we need specific environments.
upvoted 0 times
...
Peggie
4 months ago
I remember practicing with Kubeflow, and it seemed powerful, but I wonder if setting it up would be too complex for our needs.
upvoted 0 times
...
Josephine
4 months ago
I think Vertex AI Training could be a good choice since it supports multiple frameworks, but I'm not entirely sure how well it integrates with custom libraries.
upvoted 0 times
...
Wendell
4 months ago
Slurm workload manager could be an interesting approach, but I'm not sure how well it would integrate with the cloud infrastructure. I'll need to look into that more.
upvoted 0 times
...
Rima
5 months ago
Creating VM images on Compute Engine might work, but it feels a bit more manual than the other options. I'll have to weigh the pros and cons carefully.
upvoted 0 times
...
Timothy
5 months ago
Kubeflow on GKE could be a solid choice, but I'm not super familiar with it. I'll need to do some reading on how to configure it properly.
upvoted 0 times
...
Charolette
5 months ago
Vertex AI Training seems like a good option since it can handle multiple frameworks. I'll need to research how to set that up and make sure it meets the team's needs.
upvoted 0 times
...
Tandra
5 months ago
Hmm, this looks like a tricky one. I'll need to carefully consider the different options and how they might work with the variety of frameworks the team uses.
upvoted 0 times
...
Daren
7 months ago
I prefer creating a library of VM images on Compute Engine for more control.
upvoted 0 times
...
Colby
7 months ago
But configuring Kubeflow on GKE could also be a good choice.
upvoted 0 times
...
Eun
7 months ago
I agree with Elouise, it's the most flexible option.
upvoted 0 times
...
Jutta
7 months ago
Option C seems like a lot of overhead. Do I really want to be the VM image librarian? Nah, Vertex AI is the way to go, for sure.
upvoted 0 times
...
Elouise
7 months ago
I think we should use Vertex AI Training for any framework.
upvoted 0 times
...
Joesph
7 months ago
Haha, Slurm? What is this, the 90s? I'm going with Vertex AI, keep it simple and let Google handle the heavy lifting.
upvoted 0 times
Staci
6 months ago
Haha, Slurm? What is this, the 90s?
upvoted 0 times
...
...
Kerry
7 months ago
I'd go with option B. Kubeflow can handle all those different frameworks, and it'll be easier to manage than a bunch of custom VM images.
upvoted 0 times
Kayleigh
7 months ago
I agree, Kubeflow seems like the best option to handle all those different frameworks.
upvoted 0 times
...
Izetta
7 months ago
B) Configure Kubeflow to run on Google Kubernetes Engine and submit training jobs through TFJob.
upvoted 0 times
...
Pauline
7 months ago
A) Use the Vertex AI Training to submit training jobs using any framework.
upvoted 0 times
...
...
Bettina
8 months ago
Vertex AI Training sounds like the way to go. I don't want to deal with the hassle of setting up Kubeflow or managing VMs. That's just too much work.
upvoted 0 times
Essie
6 months ago
Vertex AI Training is definitely the easiest option. No need to deal with setting up Kubeflow or managing VMs.
upvoted 0 times
...
Stephane
7 months ago
A) Use the Vertex AI Training to submit training jobs using any framework.
upvoted 0 times
...
...
