Isaca AAIA Exam - Topic 3 Question 13 Discussion

Actual exam question for Isaca's AAIA exam
Question #: 13
Topic #: 3

When converting data categories before training an AI model, which of the following scenarios represents the GREATEST risk?

Suggested Answer: C

The AAIA Study Guide emphasizes that encoding categorical variables must preserve the semantic meaning and, where relevant, the order of the categories. The greatest risk arises when ordinal data, such as customer rewards tiers, is treated as nominal through one-hot encoding, which discards the inherent order and may impair model learning.

"Improper encoding of ordinal variables as nominal can distort the model's understanding of relationships, leading to inaccurate predictions or biased outcomes."

Customer reward categories (economy < business < first class) have a natural order. One-hot encoding ignores this order, potentially degrading model accuracy. The other options (car colors, dog breeds, product flavors) are nominal data with no inherent order, so they are appropriately encoded.
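
To make the difference concrete, here is a minimal sketch in plain Python that encodes the same rewards tiers both ways. The tier names come from the explanation above; the function names and the integer ranks are assumptions made for illustration, and equal spacing between ranks is itself a modeling choice that may not suit every dataset.

```python
# Tier names taken from the explanation above, listed lowest to highest.
# The integer ranks (0, 1, 2) are an illustrative assumption.
TIER_ORDER = ["economy", "business", "first class"]


def ordinal_encode(value, order=TIER_ORDER):
    """Map a tier to its rank so that economy < business < first class."""
    return order.index(value)


def one_hot_encode(value, order=TIER_ORDER):
    """Map a tier to an indicator vector; no tier ranks above another."""
    return [1 if value == category else 0 for category in order]


customers = ["business", "economy", "first class"]
ordinal = [ordinal_encode(c) for c in customers]
one_hot = [one_hot_encode(c) for c in customers]

print(ordinal)  # [1, 0, 2]
print(one_hot)  # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
```

With the ordinal encoding, a model can see that "first class" sits above "business"; with the one-hot vectors, all three tiers are equally far apart, so that ordering is no longer available to the model.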


Paola
26 days ago
True, but C has more implications in customer behavior. It’s complex.
upvoted 0 times
...
Yvonne
1 month ago
But what about A? Car colors seem simple, right?
upvoted 0 times
...
Reita
1 month ago
I agree, especially with customer rewards. It can lead to overfitting.
upvoted 0 times
...
Emilio
1 month ago
I think option C is risky. Too many categories can confuse the model.
upvoted 0 times
...
Lai
2 months ago
Dummy variables for dog breeds? That’s a small set, not risky.
upvoted 0 times
...
Desire
2 months ago
Totally agree with Walton, C seems sketchy.
upvoted 0 times
...
Willis
2 months ago
Wait, why is one-hot encoding risky? Sounds fine to me.
upvoted 0 times
...
Walton
2 months ago
I think option C has the greatest risk, too many categories!
upvoted 0 times
...
Izetta
2 months ago
One-hot encoding can lead to high dimensionality issues.
upvoted 0 times
...
Tyisha
2 months ago
Haha, I bet the exam writer is a dog person. Definitely not going with B!
upvoted 0 times
...
Wade
3 months ago
Hmm, I'd say B is the most dangerous. Dog breeds have a lot of hidden biases.
upvoted 0 times
...
Gail
3 months ago
D is the way to go. Dummy variables for product flavors are the safest bet.
upvoted 0 times
...
Melissia
3 months ago
I agree, C is the riskiest. Encoding customer rewards could reveal sensitive information about individuals.
upvoted 0 times
...
Emogene
4 months ago
Option C is the greatest risk. One-hot encoding customer rewards categories could lead to data leakage and overfitting.
upvoted 0 times
...
Marsha
4 months ago
I'm a bit confused about the differences between one-hot encoding and dummy variables. I wonder if that affects which scenario is riskier.
upvoted 0 times
...
Donte
4 months ago
I practiced a similar question, and I feel like creating dummy variables for dog breeds could introduce bias if some breeds are underrepresented.
upvoted 0 times
...
Deja
4 months ago
I think it might be option C, since customer rewards categories could have a significant impact on model performance if not handled correctly.
upvoted 0 times
...
Ryan
4 months ago
I remember discussing how one-hot encoding can lead to a high-dimensional space, but I'm not sure which option has the greatest risk.
upvoted 0 times
...
Alethea
4 months ago
I'm leaning towards the product flavor one-hot encoding as the riskiest. With that many unique flavors, you could end up with a ton of sparse columns that the model might have trouble generalizing from.
upvoted 0 times
...
Filiberto
5 months ago
For this type of data prep question, I usually try to think about the cardinality of the categories. The customer rewards one seems like it could have the most unique values, so I'd go with that as the highest risk.
upvoted 0 times
...
Shalon
5 months ago
Hmm, I'm a bit confused on this one. I was thinking the dog breed dummy variables might be the riskiest since there could be a lot of different breeds. But I'm not totally confident in that.
upvoted 0 times
...
Renato
5 months ago
I'm not totally sure about this one, but I think the greatest risk would be one-hot encoding the customer rewards category. That seems like it could lead to a lot of sparse data and potential overfitting.
upvoted 0 times
Whitney
15 days ago
Yeah, especially with many categories.
upvoted 0 times
...
Vashti
20 days ago
I agree, one-hot encoding can create sparsity.
upvoted 0 times
...
...
