Welcome to Pass4Success


Google Professional Machine Learning Engineer Exam - Topic 1 Question 96 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 96
Topic #: 1

You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while ensuring that performance is uniform across the various languages and without changing the serving infrastructure.

You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?

A) Add a regularization term such as the Min-Diff algorithm to the loss function.
B) Train a classifier using the chat messages in their original language.
C) Replace the in-house word2vec with GPT-3 or T5.
D) Remove moderation for languages with high false positive rates.

Suggested Answer: B

Training the classifier on the chat messages in their original language removes the translation step from the model's feature path. Translation quality from the Cloud Translation API varies across the 20+ supported languages, so a model trained on translated text inherits that language-dependent error, which is exactly the uneven performance observed. Training directly on original-language text (for example, with multilingual embeddings) lets the model learn each language's characteristics natively and requires no change to the serving infrastructure, satisfying both constraints in the question.

Option A (adding a regularization term such as Min-Diff to the loss function) can narrow performance gaps between groups, but it treats the symptom rather than the cause: the input features are still translations of uneven quality.

Option C (replacing the in-house word2vec with GPT-3 or T5) upgrades the embedding but still operates on translated text, so the language-dependent translation error remains, and a much larger model may also conflict with the requirement not to change the serving infrastructure.

Option D (removing moderation for languages with high false positive rates) abandons the requirement to moderate chat across all languages and would leave those communities unprotected.
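The question's core requirement is uniform performance across languages, so whichever option is chosen, the fix should be verified by comparing per-language metrics. Below is a minimal, self-contained sketch (with hypothetical toy data and function names, not any Google Cloud API) of how one might measure the cross-language performance gap before and after switching from translated to original-language training data:

```python
# Illustrative sketch: quantify how uniform a moderation classifier's
# accuracy is across languages. All data and names here are hypothetical.
from collections import defaultdict

def per_language_accuracy(predictions):
    """predictions: iterable of (language, predicted_label, true_label)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, pred, truth in predictions:
        total[lang] += 1
        if pred == truth:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}

def performance_gap(accuracies):
    """Largest accuracy difference between any two languages.
    0.0 means perfectly uniform performance."""
    values = list(accuracies.values())
    return max(values) - min(values)

# Toy example: a model trained on translated text (skewed against one
# language) versus one trained on original-language messages.
translated_model = [("en", 1, 1), ("en", 0, 0), ("tr", 1, 0), ("tr", 0, 1)]
original_model = [("en", 1, 1), ("en", 0, 0), ("tr", 1, 1), ("tr", 0, 1)]

print(performance_gap(per_language_accuracy(translated_model)))  # 1.0
print(performance_gap(per_language_accuracy(original_model)))    # 0.5
```

A shrinking gap after the change is evidence that training on original-language messages (option B) addressed the disparity rather than just shifting it.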


Contribute your Thoughts:

Ruthann
3 months ago
A regularization term could help balance things out, worth a try!
upvoted 0 times
...
Dorinda
3 months ago
Wait, removing moderation for some languages? That seems risky!
upvoted 0 times
...
Desiree
3 months ago
C sounds interesting! GPT-3 could really enhance understanding.
upvoted 0 times
...
Dorthy
4 months ago
Not sure about that, B could miss context from translations.
upvoted 0 times
...
Ciara
4 months ago
I think option B makes the most sense. Original language is key.
upvoted 0 times
...
Lorriane
4 months ago
I feel like removing moderation for certain languages is not a good idea. It could lead to a lot of issues. Definitely not option D for me!
upvoted 0 times
...
Rosendo
4 months ago
Replacing word2vec with something like GPT-3 sounds tempting, but I wonder if it would really solve the performance issue across all languages. Option C might be risky.
upvoted 0 times
...
Dustin
4 months ago
I'm not entirely sure, but I think adding a regularization term could help with performance consistency across languages. Maybe option A?
upvoted 0 times
...
Fabiola
5 months ago
I remember we discussed the importance of training models on original language data to capture nuances better. So, option B seems like a solid choice.
upvoted 0 times
...
Golda
5 months ago
I'm a bit hesitant about option D - removing moderation for languages with high false positive rates. That doesn't seem like a great long-term solution. I think I'll try one of the other options that focuses on improving the model itself.
upvoted 0 times
...
Alyssa
5 months ago
Ooh, option C looks interesting - replacing the in-house word2vec with a more powerful model like GPT-3 or T5. That could really boost the performance across the board.
upvoted 0 times
...
Stephen
5 months ago
I'm a bit confused by this question. I'm not sure if I fully understand the problem or the different options. Maybe I'll go with option B and try training the classifier on the original language messages instead of the translations.
upvoted 0 times
...
Leota
5 months ago
Hmm, this is a tricky one. I think I'll try option A - adding a regularization term like Min-Diff to the loss function. That should help balance the performance across languages.
upvoted 0 times
...
Alease
10 months ago
This chat moderation task reminds me of that old saying - 'lost in translation' takes on a whole new meaning when millions of players are involved!
upvoted 0 times
...
Nieves
10 months ago
Replace the in-house word2vec with GPT-3 or T5? Sounds like a job for Optimus Prime!
upvoted 0 times
...
Ardella
10 months ago
I wouldn't recommend removing moderation for languages with high false positive rates. That could lead to unchecked toxicity in those communities. Better to keep trying to improve the model.
upvoted 0 times
Levi
9 months ago
I agree, removing moderation for languages with high false positive rates is not a good idea. We should keep working on improving the model.
upvoted 0 times
...
Sabrina
9 months ago
B) Train a classifier using the chat messages in their original language.
upvoted 0 times
...
Helene
9 months ago
A) Add a regularization term such as the Min-Diff algorithm to the loss function.
upvoted 0 times
...
...
Rosalind
10 months ago
Training a classifier directly on the original language messages is an intriguing idea. That way the model can learn the nuances of each language natively without relying on translations.
upvoted 0 times
Vilma
9 months ago
C) Replace the in-house word2vec with GPT-3 or T5.
upvoted 0 times
...
Marleen
9 months ago
B) Train a classifier using the chat messages in their original language.
upvoted 0 times
...
Peggy
9 months ago
A) Add a regularization term such as the Min-Diff algorithm to the loss function.
upvoted 0 times
...
...
Judy
10 months ago
Regularizing the model with the Min-Diff algorithm sounds like a good approach to balance the performance across languages. Interesting that the in-house word2vec is struggling - maybe GPT-3 or T5 could provide better text representations.
upvoted 0 times
Cordell
9 months ago
C) Replace the in-house word2vec with GPT-3 or T5.
upvoted 0 times
...
Avery
9 months ago
B) Train a classifier using the chat messages in their original language.
upvoted 0 times
...
Cheryll
9 months ago
A) Add a regularization term such as the Min-Diff algorithm to the loss function.
upvoted 0 times
...
...
Tanja
11 months ago
But wouldn't replacing the in-house word2vec with GPT-3 or T5 be a better option?
upvoted 0 times
...
Olive
11 months ago
I agree with Kirk. It would help improve the performance across different languages.
upvoted 0 times
...
Kirk
11 months ago
I think we should train a classifier using the chat messages in their original language.
upvoted 0 times
...
