Welcome to Pass4Success


Google Professional Machine Learning Engineer Exam - Topic 10 Question 60 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 60
Topic #: 10
[All Professional Machine Learning Engineer Questions]

You have trained a DNN regressor with TensorFlow to predict housing prices using a set of predictive features. Your default precision is tf.float64, and you use a standard TensorFlow estimator:

estimator = tf.estimator.DNNRegressor(
    feature_columns=[YOUR_LIST_OF_FEATURES],
    hidden_units=[1024, 512, 256],
    dropout=None)

Your model performs well, but just before deploying it to production, you discover that your current serving latency is 10 ms at the 90th percentile, and you currently serve on CPUs. Your production requirements call for a model latency of 8 ms at the 90th percentile. You are willing to accept a small decrease in performance in order to reach the latency requirement. Therefore, your plan is to improve latency while evaluating how much the model's predictive performance decreases. What should you try first to quickly lower the serving latency?
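Why precision is the lever here: a float64 value occupies twice the storage of a float32, so dropping precision roughly halves the memory traffic per prediction. A minimal standard-library check (illustrative only, not part of the exam question):

```python
import struct

# Bytes needed to store one value at each precision.
f64 = len(struct.pack("d", 3.14))  # IEEE 754 double precision -> 8 bytes
f32 = len(struct.pack("f", 3.14))  # IEEE 754 single precision -> 4 bytes

print(f64, f32)  # 8 4
assert f64 == 2 * f32
```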

Suggested Answer: D

Applying quantization to your SavedModel by reducing the floating-point precision lowers serving latency by decreasing the memory and computation required per prediction. TensorFlow provides tools such as the tf.quantization module that can be used to quantize models and reduce their precision, which can significantly reduce serving latency with only a small decrease in model performance.
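The idea behind quantization can be sketched without TensorFlow at all. The snippet below is a hypothetical, pure-Python illustration of affine int8 quantization (the helper names are invented for this sketch; a real deployment would use TensorFlow's quantization tooling on the SavedModel):

```python
def quantize(weights, num_bits=8):
    """Affine-quantize floats to signed ints: q = round(w / scale) + zero_point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    wmin, wmax = min(weights), max(weights)
    scale = (wmax - wmin) / (qmax - qmin) or 1.0
    zero_point = round(qmin - wmin / scale)
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.2, -0.3, 0.0, 0.7, 1.5]
q, scale, zp = quantize(weights)
restored = dequantize(q, scale, zp)

# int8 storage is 4x smaller than float32 and 8x smaller than float64,
# and the round-trip error stays within one quantization step (= scale).
max_err = max(abs(w - r) for w, r in zip(weights, restored))
assert max_err <= scale
```

This is why the accuracy drop is usually small: each weight moves by at most one quantization step, while every weight shrinks from 8 bytes (float64) to 1 byte.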


Contribute your Thoughts:

Loreta
4 months ago
I agree, quantization seems like a quick fix!
upvoted 0 times
Olene
4 months ago
Definitely not A, that won't help in PREDICT mode.
upvoted 0 times
Emeline
4 months ago
Wait, increasing dropout to 0.8? Isn't that too high?
upvoted 0 times
Donette
5 months ago
I think quantization is the way to go, less precision might not hurt much.
upvoted 0 times
Rickie
5 months ago
Switching to GPU should help with latency!
upvoted 0 times
Paz
5 months ago
I’m a bit confused about the dropout adjustment. Wouldn't retraining the model after increasing it take too long?
upvoted 0 times
Rosita
5 months ago
I practiced a similar question where quantization was mentioned as a way to improve performance. It seems like a good approach here too.
upvoted 0 times
Dorothy
5 months ago
I think switching to GPU serving could be a quick fix for latency, but I’m not certain if it’s the best option for this scenario.
upvoted 0 times
Lino
5 months ago
I remember we discussed dropout rates in class, but I'm not sure increasing it to 0.8 would help with latency.
upvoted 0 times
Serina
5 months ago
This question seems pretty straightforward. I think I can handle this one.
upvoted 0 times
