
Google Professional Machine Learning Engineer Exam - Topic 1 Question 96 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 96
Topic #: 1

You work for a gaming company that has millions of customers around the world. All games offer a chat feature that allows players to communicate with each other in real time. Messages can be typed in more than 20 languages and are translated in real time using the Cloud Translation API. You have been asked to build an ML system to moderate the chat in real time while ensuring that performance is uniform across the various languages, without changing the serving infrastructure.

You trained your first model using an in-house word2vec model for embedding the chat messages translated by the Cloud Translation API. However, the model has significant differences in performance across the different languages. How should you improve it?

A. Add a regularization term such as the Min-Diff algorithm to the loss function.
B. Train a classifier using the chat messages in their original language.
C. Replace the in-house word2vec with GPT-3 or T5.
D. Remove moderation for languages with high false positive rates.

Suggested Answer: B

The translation step is the most likely source of the per-language performance gap. Cloud Translation quality varies from language to language, so the translated text the classifier sees is noisier for some languages than for others, and chat-specific nuances (slang, idioms, deliberate misspellings) are often lost before the model ever sees them.

Training the classifier on the chat messages in their original language (option B) removes this unevenly distributed translation noise and lets the model learn each language's patterns natively, which directly targets uniform performance. It also leaves the serving infrastructure unchanged: the model still scores the raw chat message in real time.

Option A (adding a regularization term such as Min-Diff to the loss function) can narrow performance gaps between groups, but it treats the symptom rather than the cause; the training data would still be unevenly degraded translations.

Option C (replacing the in-house word2vec with GPT-3 or T5) changes the text representation, but the inputs are still translations, so the language-dependent translation noise remains.

Option D (removing moderation for languages with high false positive rates) simply abandons the requirement to moderate all languages and is not an acceptable trade-off.
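
As a rough illustration of option B, the sketch below trains a moderation classifier directly on original-language messages. It is a minimal, hypothetical example, not the exam's reference solution: the messages and labels are made up, and character n-gram TF-IDF is used as a simple language-agnostic stand-in for a proper multilingual embedding.

```python
# Minimal sketch of option B: train the moderation classifier on the
# original-language chat messages instead of their translations.
# The toy data below is entirely hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled chat messages in their original languages
# (1 = violates chat policy, 0 = acceptable).
messages = [
    "you are awesome, gg",          # English
    "eres un tramposo asqueroso",   # Spanish
    "bien joué, belle partie",      # French
    "du bist so ein Betrüger",      # German
]
labels = [0, 1, 0, 1]

# Character n-grams sidestep per-language tokenization, so one model
# can be trained on all 20+ languages at once.
moderation_model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
moderation_model.fit(messages, labels)

# At serving time the raw (untranslated) message is scored directly,
# so the existing real-time serving path is unchanged.
print(moderation_model.predict(["qué buena partida"]))
```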
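For contrast, option A's idea of a Min-Diff-style regularizer can be sketched as an extra loss term that penalizes differences in the model's score distributions across languages. The version below is a deliberately simplified stand-in (a mean-gap penalty rather than the kernel-based MMD matching that the real MinDiff technique uses, e.g. in the tensorflow-model-remediation package), and all tensors are hypothetical dummy values.

```python
import tensorflow as tf

def min_diff_penalty(scores_group_a: tf.Tensor, scores_group_b: tf.Tensor) -> tf.Tensor:
    """Penalize the gap between mean predicted scores of two language groups.

    Simplified stand-in for MinDiff, which matches whole score
    distributions with an MMD kernel rather than just their means.
    """
    return tf.abs(tf.reduce_mean(scores_group_a) - tf.reduce_mean(scores_group_b))

def moderation_loss(y_true, y_pred, scores_lang_a, scores_lang_b, weight=1.0):
    # Standard classification loss plus the fairness regularization term.
    base = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y_true, y_pred))
    return base + weight * min_diff_penalty(scores_lang_a, scores_lang_b)

# Hypothetical usage with dummy tensors.
y_true = tf.constant([[1.0], [0.0], [1.0]])
y_pred = tf.constant([[0.8], [0.2], [0.6]])
scores_a = tf.constant([0.8, 0.6])   # model scores on language A's messages
scores_b = tf.constant([0.2])        # model scores on language B's messages
print(moderation_loss(y_true, y_pred, scores_a, scores_b, weight=0.5))
```

Note that even with such a term, the model in this question would still be learning from unevenly degraded translations, which is why the suggested answer prefers option B.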


Contribute your Thoughts:

Dorthy
6 days ago
Not sure about that, B could miss context from translations.
upvoted 0 times
...
Ciara
12 days ago
I think option B makes the most sense. Original language is key.
upvoted 0 times
...
Lorriane
17 days ago
I feel like removing moderation for certain languages is not a good idea. It could lead to a lot of issues. Definitely not option D for me!
upvoted 0 times
...
Rosendo
23 days ago
Replacing word2vec with something like GPT-3 sounds tempting, but I wonder if it would really solve the performance issue across all languages. Option C might be risky.
upvoted 0 times
...
Dustin
28 days ago
I'm not entirely sure, but I think adding a regularization term could help with performance consistency across languages. Maybe option A?
upvoted 0 times
...
Fabiola
1 month ago
I remember we discussed the importance of training models on original language data to capture nuances better. So, option B seems like a solid choice.
upvoted 0 times
...
Golda
1 month ago
I'm a bit hesitant about option D - removing moderation for languages with high false positive rates. That doesn't seem like a great long-term solution. I think I'll try one of the other options that focuses on improving the model itself.
upvoted 0 times
...
Alyssa
1 month ago
Ooh, option C looks interesting - replacing the in-house word2vec with a more powerful model like GPT-3 or T5. That could really boost the performance across the board.
upvoted 0 times
...
Stephen
1 month ago
I'm a bit confused by this question. I'm not sure if I fully understand the problem or the different options. Maybe I'll go with option B and try training the classifier on the original language messages instead of the translations.
upvoted 0 times
...
Leota
1 month ago
Hmm, this is a tricky one. I think I'll try option A - adding a regularization term like Min-Diff to the loss function. That should help balance the performance across languages.
upvoted 0 times
...
Alease
6 months ago
This chat moderation task reminds me of that old saying - 'lost in translation' takes on a whole new meaning when millions of players are involved!
upvoted 0 times
...
Nieves
6 months ago
Replace the in-house word2vec with GPT-3 or T5? Sounds like a job for Optimus Prime!
upvoted 0 times
...
Ardella
6 months ago
I wouldn't recommend removing moderation for languages with high false positive rates. That could lead to unchecked toxicity in those communities. Better to keep trying to improve the model.
upvoted 0 times
Levi
5 months ago
I agree, removing moderation for languages with high false positive rates is not a good idea. We should keep working on improving the model.
upvoted 0 times
...
Sabrina
5 months ago
B) Train a classifier using the chat messages in their original language.
upvoted 0 times
...
Helene
5 months ago
A) Add a regularization term such as the Min-Diff algorithm to the loss function.
upvoted 0 times
...
...
Rosalind
6 months ago
Training a classifier directly on the original language messages is an intriguing idea. That way the model can learn the nuances of each language natively without relying on translations.
upvoted 0 times
Vilma
6 months ago
C) Replace the in-house word2vec with GPT-3 or T5.
upvoted 0 times
...
Marleen
6 months ago
B) Train a classifier using the chat messages in their original language.
upvoted 0 times
...
Peggy
6 months ago
A) Add a regularization term such as the Min-Diff algorithm to the loss function.
upvoted 0 times
...
...
Judy
7 months ago
Regularizing the model with the Min-Diff algorithm sounds like a good approach to balance the performance across languages. Interesting that the in-house word2vec is struggling - maybe GPT-3 or T5 could provide better text representations.
upvoted 0 times
Cordell
5 months ago
C) Replace the in-house word2vec with GPT-3 or T5.
upvoted 0 times
...
Avery
5 months ago
B) Train a classifier using the chat messages in their original language.
upvoted 0 times
...
Cheryll
6 months ago
A) Add a regularization term such as the Min-Diff algorithm to the loss function.
upvoted 0 times
...
...
Tanja
7 months ago
But wouldn't replacing the in-house word2vec with GPT-3 or T5 be a better option?
upvoted 0 times
...
Olive
7 months ago
I agree with Kirk. It would help improve the performance across different languages.
upvoted 0 times
...
Kirk
7 months ago
I think we should train a classifier using the chat messages in their original language.
upvoted 0 times
...
