CompTIA Exam DY0-001 Topic 5 Question 4 Discussion

Actual exam question for CompTIA's DY0-001 exam
Question #: 4
Topic #: 5

A data scientist is developing a model to predict the outcome of a vote for a national mascot. The choice is between tigers and lions. The full data set represents feedback from individuals representing 17 professions and 12 different locations. The following rank aggregation represents 80% of the data set:

Which of the following is the most likely concern about the model's ability to predict the outcome of the vote?

Suggested Answer: D

The aggregated feedback covers only 80% of respondents, mostly from a few professions and locations, so the model hasn't "seen" the remaining 20% (and those underrepresented groups). Its performance on those unseen subsets (out-of-sample data) is therefore the primary concern for how well it will predict the actual vote.
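To make the out-of-sample concern concrete, here is a minimal sketch in plain Python. The votes, the profession names, and the "majority vote per profession" model are all made up for illustration; none of it comes from the exam's data set. The model is fitted on 80% of the rows, drawn from two professions, and then scored on the remaining 20%, which come from a profession it never saw.

```python
from collections import Counter

# Hypothetical, hand-made votes as (profession, vote) pairs. Not the exam's data.
# 8 of 10 rows (80%) are available for fitting; 2 rows (20%) are held out.
seen = [("teacher", "tiger")] * 5 + [("teacher", "lion")] + [("nurse", "tiger")] * 2
unseen = [("farmer", "lion")] * 2


def fit(rows):
    """Learn the majority vote per profession, plus an overall fallback."""
    by_group = {}
    for profession, vote in rows:
        by_group.setdefault(profession, Counter())[vote] += 1
    overall = Counter(vote for _, vote in rows).most_common(1)[0][0]
    per_group = {p: c.most_common(1)[0][0] for p, c in by_group.items()}
    return per_group, overall


def predict(model, profession):
    """Use the profession's majority if known, otherwise the overall majority."""
    per_group, overall = model
    return per_group.get(profession, overall)


def accuracy(model, rows):
    """Fraction of rows whose vote matches the model's prediction."""
    hits = sum(predict(model, p) == v for p, v in rows)
    return hits / len(rows)


model = fit(seen)
in_sample = accuracy(model, seen)        # scored on the rows used for fitting
out_of_sample = accuracy(model, unseen)  # scored on rows the model never saw

print(f"in-sample accuracy:     {in_sample:.3f}")
print(f"out-of-sample accuracy: {out_of_sample:.3f}")
```

In this toy setup the model looks strong on the data it was fitted on and fails on the held-out group, which is the gap the suggested answer is pointing at. Real data would rarely be this extreme.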


Contribute your Thoughts:

Malcom
4 days ago
Ha! This is like a real-life version of the age-old debate: tigers vs. lions. I bet the data scientists are having a field day with this one.
upvoted 0 times
...
Tyisha
16 days ago
But what about in-sample data? Could that also be a concern for the model's prediction?
upvoted 0 times
...
Kaitlyn
20 days ago
I agree with Alease: using data outside the range may not accurately predict the outcome.
upvoted 0 times
...
Janine
21 days ago
I'm not sure, but I'd be worried about the potential for bias in the data. Tigers and lions are both pretty exciting mascots, but I wonder if certain regions or professions might have a preference.
upvoted 0 times
Alba
9 days ago
That's a good point. They might need to consider collecting more data from a wider range of sources to improve the model's accuracy.
upvoted 0 times
...
Talia
10 days ago
Yeah, it could be biased towards those specific groups. Maybe they should try to get more diverse data.
upvoted 0 times
...
Lashawnda
12 days ago
I think the concern might be that the model is only based on data from certain professions and locations.
upvoted 0 times
...
...
Shizue
23 days ago
Out-of-sample data seems like the most likely issue. The model is trained on only 80% of the data, so it might not accurately reflect the full population.
upvoted 0 times
Luisa
2 days ago
Yes, the model might not generalize well to the entire population with only 80% of the data.
upvoted 0 times
...
Lazaro
17 days ago
I agree, out-of-sample data could lead to inaccurate predictions.
upvoted 0 times
...
...
Alease
27 days ago
I think the concern could be extrapolated data.
upvoted 0 times
...
Leana
1 month ago
You know, I bet the model would be a lot more accurate if they just had a vote-off between a tiger and a lion mascot. That would give us the true pulse of the nation!
upvoted 0 times
Cordelia
3 days ago
Yeah, using data outside of what was collected could affect the model's accuracy.
upvoted 0 times
...
Maurine
19 days ago
I think the concern might be extrapolated data.
upvoted 0 times
...
...
