Google Professional Machine Learning Engineer Exam - Topic 11 Question 55 Discussion

You work for a retailer that sells clothes to customers around the world. You have been tasked with ensuring that ML models are built in a secure manner. Specifically, you need to protect sensitive customer data that might be used in the models. You have identified four fields containing sensitive data that are being used by your data science team: AGE, IS_EXISTING_CUSTOMER, LATITUDE_LONGITUDE, and SHIRT_SIZE. What should you do with the data before it is made available to the data science team for training purposes?
A) Tokenize all of the fields using hashed dummy values to replace the real values.
B) Use principal component analysis (PCA) to reduce the four sensitive fields to one PCA vector.
C) Coarsen the data by putting AGE into quantiles and rounding LATITUDE_LONGITUDE into single precision. The other two fields are already as coarse as possible.
D) Remove all sensitive data fields, and ask the data science team to build their models using non-sensitive data.
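
To make option C concrete, here is a minimal Python sketch of coarsening. It is only an illustration under stated assumptions: the sample ages, the helper names `age_quantile_bucket` and `coarsen_lat_lon`, and the choice of one decimal place for the coordinates are all hypothetical and do not come from the question.

```python
import statistics


def age_quantile_bucket(age, cut_points):
    """Return the index of the quantile bucket that `age` falls into."""
    # Count how many cut points lie strictly below the age.
    return sum(age > cut for cut in cut_points)


def coarsen_lat_lon(lat, lon, decimals=1):
    """Round coordinates to fewer decimal places (assumed reading of 'rounding')."""
    return (round(lat, decimals), round(lon, decimals))


# Hypothetical sample data, used only to derive quartile cut points.
ages = [19, 23, 31, 34, 42, 47, 55, 68]
cut_points = statistics.quantiles(ages, n=4)

# Each age is replaced by its quartile index (0 to 3).
buckets = [age_quantile_bucket(a, cut_points) for a in ages]

# Each coordinate pair is replaced by a rounded pair.
coarse_location = coarsen_lat_lon(48.8566, 2.3522)
```

Coarsening keeps the fields usable as model features, but the bucketed values are still derived directly from the real ones, which is the concern several commenters raise below.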

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 55
Topic #: 11

Suggested Answer: A
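
As a rough illustration of what option A could look like, the Python sketch below replaces each sensitive value with a keyed hash. Treat it as one possible reading, not the exam's reference method: the use of HMAC-SHA256, the placeholder key, the helper names, and the extra `CUSTOMER_SEGMENT` column are all assumptions made for the example. A keyed hash is used because low-cardinality fields such as IS_EXISTING_CUSTOMER or SHIRT_SIZE could otherwise be reversed by hashing every possible value.

```python
import hashlib
import hmac

# Hypothetical placeholder; a real key would come from a secret manager.
SECRET_KEY = b"replace-with-a-managed-secret"

# The four sensitive fields named in the question.
SENSITIVE_FIELDS = ("AGE", "IS_EXISTING_CUSTOMER", "LATITUDE_LONGITUDE", "SHIRT_SIZE")


def tokenize_value(field, value, key=SECRET_KEY):
    """Map a (field, value) pair to a deterministic hex token."""
    # Including the field name keeps equal values in different
    # columns from producing the same token.
    message = f"{field}={value}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def tokenize_record(record, fields=SENSITIVE_FIELDS, key=SECRET_KEY):
    """Return a copy of `record` with the sensitive fields tokenized."""
    return {
        name: tokenize_value(name, value, key) if name in fields else value
        for name, value in record.items()
    }


# Hypothetical example row; CUSTOMER_SEGMENT is not a field from the question.
row = {
    "CUSTOMER_SEGMENT": "retail",
    "AGE": 34,
    "IS_EXISTING_CUSTOMER": True,
    "LATITUDE_LONGITUDE": "48.8566,2.3522",
    "SHIRT_SIZE": "M",
}
tokenized = tokenize_record(row)
```

Because the mapping is deterministic, equal inputs still yield equal tokens, so the tokenized columns can be treated as categorical features. Any ordering or distance between the original values (for example between ages) is lost.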

Mitsue
7 months ago
Definitely need to hash those values for security!
upvoted 0 times
...
Nilsa
7 months ago
Coarsening data could work, but is it enough?
upvoted 0 times
...
Lore
7 months ago
Wait, can PCA really protect sensitive info? Sounds risky.
upvoted 0 times
...
Chau
8 months ago
Disagree, removing all sensitive data seems too extreme!
upvoted 0 times
...
Shaniqua
8 months ago
I think tokenizing is the safest bet here.
upvoted 0 times
...
Iluminada
8 months ago
Removing all sensitive fields sounds extreme, but I guess it could be a safe option if we have enough non-sensitive data to work with.
upvoted 0 times
...
Adelle
8 months ago
Coarsening AGE and LATITUDE_LONGITUDE makes sense, but I'm not confident if it's enough to protect sensitive data.
upvoted 0 times
...
France
8 months ago
PCA seems like it could help, but I think it might lose important information about the original data.
upvoted 0 times
...
Fannie
8 months ago
I remember we discussed tokenization in class, but I'm not sure if hashing is the best approach for all these fields.
upvoted 0 times
...
Robt
8 months ago
Wait, I thought the expenditure approach also included things like transfer payments and money supply. This question is tripping me up a bit. I'll have to review my notes to make sure I understand this properly.
upvoted 0 times
...
Cherry
8 months ago
This question seems straightforward, I think I can handle it.
upvoted 0 times
...
Franklyn
8 months ago
Okay, let's think this through step-by-step. We need a virtual receptionist to connect callers, and a call queue to handle the help desk calls in FIFO order. I'm pretty confident those are the right resources to create.
upvoted 0 times
...
