
Google Professional-Machine-Learning-Engineer Exam: Topic 9, Question 72 Discussion

Actual exam question from Google's Professional Machine Learning Engineer exam
Question #: 72
Topic #: 9

You are developing a model to help your company create more targeted online advertising campaigns. You need to create a dataset that you will use to train the model. You want to avoid creating or reinforcing unfair bias in the model. What should you do?

Choose 2 answers

Suggested Answer: C, E

To avoid creating or reinforcing unfair bias in the model, you should collect a random sample of production traffic to build the training dataset (C) and conduct fairness tests across sensitive categories and demographics on the trained model (E).

A representative sample is one that reflects the true distribution of the population and does not over- or under-represent any group. A random sample is a simple way to obtain one, because every data point has an equal chance of being selected. A stratified sample also yields representative data by giving every subgroup proportional representation, but it requires prior knowledge of the subgroups and their sizes, which may not be available or easy to obtain. A random sample is therefore the more feasible option in this case.

A fairness test measures and evaluates the potential bias and discrimination of the model across categories and demographics such as age, gender, and race. It helps you identify and mitigate any unfair outcomes or impacts of the model and ensure that the model treats all groups fairly and equitably. Fairness tests can be conducted using methods and tools such as confusion matrices, ROC curves, and fairness indicators.

Reference: The answer can be verified from official Google Cloud documentation and resources on data sampling and fairness testing:

- Sampling data | BigQuery
- Fairness Indicators | TensorFlow
- What-If Tool | TensorFlow
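To make the sampling trade-off concrete, here is a minimal sketch in Python. The DataFrame, the `user_group` column, and the group sizes are hypothetical placeholders for logged production traffic, not part of the exam question or Google's documentation.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for logged production traffic: 1,000 requests
# from three user groups with deliberately unequal sizes.
traffic = pd.DataFrame({
    "feature": range(1000),
    "user_group": ["A"] * 700 + ["B"] * 250 + ["C"] * 50,
})

# Random sample: every row has an equal chance of selection, so group
# proportions match the population in expectation, with no prior
# knowledge of the subgroups needed.
random_sample = traffic.sample(frac=0.1, random_state=42)

# Stratified sample: group proportions are preserved exactly, but the
# subgroup labels must be known up front.
stratified_sample, _ = train_test_split(
    traffic,
    train_size=0.1,
    stratify=traffic["user_group"],
    random_state=42,
)

print(random_sample["user_group"].value_counts(normalize=True))
print(stratified_sample["user_group"].value_counts(normalize=True))
```

Running the sketch shows both samples tracking the 70/25/5 split: the stratified sample matches it exactly, while the random sample matches it only approximately.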
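A per-group fairness check in the spirit of answer E can be sketched the same way. The labels, predictions, and demographic buckets below are synthetic placeholders; a real evaluation would compute these slice metrics with a tool such as Fairness Indicators or the What-If Tool.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)

# Synthetic evaluation results for a trained binary classifier.
results = pd.DataFrame({
    "y_true": rng.integers(0, 2, 1000),
    "y_pred": rng.integers(0, 2, 1000),
    "group": rng.choice(["18-24", "25-54", "55+"], 1000),
})

# Compare true-positive and false-positive rates per demographic slice;
# large gaps between groups signal potentially unfair model behavior.
for group, rows in results.groupby("group"):
    tn, fp, fn, tp = confusion_matrix(
        rows["y_true"], rows["y_pred"], labels=[0, 1]
    ).ravel()
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    print(f"{group}: TPR={tpr:.2f}  FPR={fpr:.2f}")
```

On real data, a marked TPR or FPR gap between slices would prompt mitigation, such as rebalancing the data, adjusting per-group thresholds, or revisiting the features, before the model ships.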


Contribute your Thoughts:

Thersa
8 days ago
Haha, can you imagine if we just went with option B? 'Oh, here's our dataset for targeted advertising - it's just a bunch of middle-aged white dudes.' Yeah, no, that's not going to fly. I'm with you guys on the stratified sampling approach. Gotta keep that diversity in check, you know?
upvoted 0 times
Lettie
9 days ago
Hmm, I'm not too keen on option B. Focusing only on the groups that interact most with ads could lead to some serious skew in the data. And option A, with all the demographic features, just feels like a recipe for disaster. Fairness testing, as in option E, is definitely important, but we need to get the data right first.
upvoted 0 times
Vallie
10 days ago
You know, I was initially leaning towards option C, the random sample, but after thinking about it, I agree that a stratified sample is probably the better approach. That way, we can make sure we're not over-representing any one group and really getting a representative dataset to train the model on.
upvoted 0 times
Christoper
11 days ago
Ah, this is a tricky one. We definitely want to avoid creating or reinforcing unfair bias, but including a comprehensive set of demographic features could potentially amplify those biases. I'm thinking option D might be the way to go - collecting a stratified sample of production traffic could help ensure we capture a diverse range of users and perspectives.
upvoted 0 times
