Google Professional Machine Learning Engineer Exam - Topic 9 Question 72 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 72
Topic #: 9
[All Professional Machine Learning Engineer Questions]

You are developing a model to help your company create more targeted online advertising campaigns. You need to create a dataset that you will use to train the model. You want to avoid creating or reinforcing unfair bias in the model. What should you do?

Choose 2 answers:

A. Include a comprehensive set of demographic features.
B. Include only the demographic groups that most frequently interact with advertisements.
C. Collect a random sample of production traffic to build the training dataset.
D. Collect a stratified sample of production traffic to build the training dataset.
E. Conduct fairness tests across sensitive categories and demographics on the trained model.

Suggested Answer: C, E

To avoid creating or reinforcing unfair bias in the model, you should collect a representative sample of production traffic to build the training dataset (C) and conduct fairness tests across sensitive categories and demographics on the trained model (E).

A representative sample reflects the true distribution of the population and neither over- nor under-represents any group. A simple random sample achieves this by giving every data point an equal chance of being selected. A stratified sample also yields representative data by guaranteeing each subgroup proportional representation, but it requires prior knowledge of the subgroups and their sizes, which may not be available or easy to obtain. A random sample is therefore the more feasible option in this case.

A fairness test measures and evaluates the potential bias and discrimination of the model across categories and demographics such as age, gender, and race. It helps you identify and mitigate unfair outcomes or impacts of the model, and ensure that the model treats all groups fairly and equitably. Fairness tests can be conducted using methods and tools such as per-group confusion matrices, ROC curves, and Fairness Indicators.

Reference: the answer can be verified from official Google Cloud documentation and resources on data sampling and fairness testing.
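The contrast between random and stratified sampling can be sketched with pandas. This is only an illustration: the `traffic` DataFrame, its column names, and the group proportions are hypothetical, not taken from any real ad-traffic schema.

```python
import pandas as pd

# Hypothetical production-traffic log; columns and values are illustrative.
traffic = pd.DataFrame({
    "user_id": range(1000),
    "age_group": (["18-24"] * 500) + (["25-34"] * 300) + (["35+"] * 200),
    "clicked": [i % 2 for i in range(1000)],
})

# Simple random sample: every row has an equal chance of selection,
# so group proportions are preserved only in expectation.
random_sample = traffic.sample(frac=0.1, random_state=42)

# Stratified sample: each age group contributes exactly its proportional share.
stratified_sample = (
    traffic.groupby("age_group", group_keys=False)
           .apply(lambda g: g.sample(frac=0.1, random_state=42))
)

print(len(random_sample))                                      # 100
print(stratified_sample["age_group"].value_counts().to_dict())
```

Note the trade-off the explanation describes: the stratified version needs the `age_group` column (i.e., prior knowledge of the subgroups) to exist before sampling, while the random version does not.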

Sampling data | BigQuery

Fairness Indicators | TensorFlow

What-if Tool | TensorFlow
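As a minimal sketch of the kind of fairness test the explanation mentions, the snippet below compares true-positive rates across two groups (an equal-opportunity check). All labels, predictions, and group assignments are made up for illustration.

```python
import numpy as np

# Toy ground truth, predictions, and a hypothetical sensitive attribute.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

def true_positive_rate(t, p):
    """Fraction of actual positives the model correctly flags."""
    positives = t == 1
    return (p[positives] == 1).mean() if positives.any() else float("nan")

# Equal-opportunity check: true-positive rate per group, and the gap between them.
rates = {g: true_positive_rate(y_true[group == g], y_pred[group == g])
         for g in np.unique(group)}
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A large gap would indicate that the model finds positives far more reliably for one group than another; tools like Fairness Indicators compute these per-slice metrics (with confidence intervals) at scale.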


Contribute your Thoughts:

Lauran
4 months ago
Fairness tests sound great, but do they really help?
upvoted 0 times
...
Micaela
4 months ago
C is good, but D is better for fairness.
upvoted 0 times
...
Peggy
4 months ago
Wait, how does D work exactly?
upvoted 0 times
...
Rosio
4 months ago
Totally disagree with B, that's just biased!
upvoted 0 times
...
Dean
4 months ago
A is a must for comprehensive insights.
upvoted 0 times
...
Dean
5 months ago
I feel like conducting fairness tests is crucial, so E makes sense. But I'm torn between A and C for the other option.
upvoted 0 times
...
Genevive
5 months ago
I think we practiced a question similar to this, and I recall that stratified sampling helps ensure representation. So, D seems right, but I'm unsure about the second choice.
upvoted 0 times
...
Mendy
5 months ago
I'm not entirely sure, but I feel like including only the most frequent demographic groups could lead to bias. Maybe A and E are safer options?
upvoted 0 times
...
Royal
5 months ago
I remember we discussed the importance of avoiding bias in our training data, so I think options C and D might be the best choices.
upvoted 0 times
...
Abel
5 months ago
I'm a little unsure about this one. Should I include all the demographic features or just focus on the most common ones? And what exactly do they mean by "fairness tests"? I'll need to do some more research to make sure I'm approaching this the right way.
upvoted 0 times
...
Jill
5 months ago
This is a tough one, but I think the key is to focus on building a representative and unbiased dataset. Collecting a random or stratified sample seems like the way to go, and then testing for fairness is crucial. I feel pretty confident about this approach.
upvoted 0 times
...
Alaine
5 months ago
Okay, I think I've got a plan. I'll collect a stratified sample of production traffic to build the training dataset, and then I'll make sure to conduct fairness tests across sensitive categories and demographics on the trained model. That should help me avoid biases.
upvoted 0 times
...
Nichelle
5 months ago
Hmm, I'm a bit confused on this one. Should I include all the demographic features or just the most common ones? I don't want to reinforce any unfair biases, but I also want the model to be effective.
upvoted 0 times
...
Katheryn
5 months ago
This is a tricky one. I'll need to think carefully about how to avoid biases in the dataset. Collecting a random or stratified sample seems like a good approach, but I'm not sure if that's enough.
upvoted 0 times
...
Lisandra
2 years ago
I think conducting fairness tests on the trained model is crucial to ensure we avoid bias.
upvoted 0 times
...
German
2 years ago
But wouldn't including a comprehensive set of demographic features help us address bias?
upvoted 0 times
...
Emiko
2 years ago
I disagree, I believe we should collect a stratified sample to ensure balanced representation.
upvoted 0 times
...
German
2 years ago
I think we should collect a random sample of production traffic for the dataset.
upvoted 0 times
...
Thersa
2 years ago
Haha, can you imagine if we just went with option B? 'Oh, here's our dataset for targeted advertising - it's just a bunch of middle-aged white dudes.' Yeah, no, that's not going to fly. I'm with you guys on the stratified sampling approach. Gotta keep that diversity in check, you know?
upvoted 0 times
...
Lettie
2 years ago
Hmm, I'm not too keen on option B. Focusing only on the groups that interact most with ads could lead to some serious skew in the data. And option A, with all the demographic features, just feels like a recipe for disaster. Fairness testing, as in option E, is definitely important, but we need to get the data right first.
upvoted 0 times
...
Vallie
2 years ago
You know, I was initially leaning towards option C, the random sample, but after thinking about it, I agree that a stratified sample is probably the better approach. That way, we can make sure we're not over-representing any one group and really getting a representative dataset to train the model on.
upvoted 0 times
...
Christoper
2 years ago
Ah, this is a tricky one. We definitely want to avoid creating or reinforcing unfair bias, but including a comprehensive set of demographic features could potentially amplify those biases. I'm thinking option D might be the way to go - collecting a stratified sample of production traffic could help ensure we capture a diverse range of users and perspectives.
upvoted 0 times
Fletcher
2 years ago
D) Collect a stratified sample of production traffic to build the training dataset.
upvoted 0 times
...
Kayleigh
2 years ago
A) Include a comprehensive set of demographic features.
upvoted 0 times
...
...
