CertNexus Exam AIP-210 Topic 6 Question 21 Discussion

Actual exam question for CertNexus's AIP-210 exam

Question #: 21
Topic #: 6

Given a feature set with rows that contain missing continuous values, and assuming the data is normally distributed, what is the best way to fill in these missing features?

A. Delete entire rows that contain any missing features.

B. Fill in missing features with random values for that feature in the training set.

C. Fill in missing features with the average of observed values for that feature in the entire dataset.

D. Delete entire columns that contain any missing features.
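
To make the four options concrete, here is a minimal sketch in plain Python on a made-up two-column feature set. The toy data, the function names, and the reading of option B as "pick a random observed value from the same column" are assumptions for illustration only; they do not come from the exam material, and the toy rows stand in for both the training set and the entire dataset.

```python
import random
import statistics

# Hypothetical toy feature set: each row is (height_cm, weight_kg).
# None marks a missing value.
rows = [
    (170.0, 65.0),
    (None, 72.0),
    (182.0, None),
    (165.0, 58.0),
    (176.0, 80.0),
]

def observed(data, j):
    """Observed (non-missing) values of column j."""
    return [r[j] for r in data if r[j] is not None]

def drop_rows(data):
    """Option A: keep only rows with no missing values."""
    return [r for r in data if None not in r]

def fill_random(data, rng):
    """Option B (assumed reading): fill each gap with a random observed value."""
    pools = [observed(data, j) for j in range(len(data[0]))]
    return [
        tuple(rng.choice(pools[j]) if v is None else v for j, v in enumerate(r))
        for r in data
    ]

def fill_mean(data):
    """Option C: fill each gap with the mean of the column's observed values."""
    means = [statistics.mean(observed(data, j)) for j in range(len(data[0]))]
    return [
        tuple(means[j] if v is None else v for j, v in enumerate(r))
        for r in data
    ]

def drop_columns(data):
    """Option D: keep only columns with no missing values."""
    keep = [j for j in range(len(data[0])) if all(r[j] is not None for r in data)]
    return [tuple(r[j] for j in keep) for r in data]

print(len(drop_rows(rows)))                  # rows surviving option A
print(fill_random(rows, random.Random(0)))   # option B with a fixed seed
print(fill_mean(rows))                       # option C
print(drop_columns(rows))                    # option D
```

In this toy case, options A and D discard data (D leaves no columns at all, because both columns have a gap), while B and C keep every row and differ only in what they write into the gaps.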

Suggested Answer: B

A support-vector machine (SVM) is a supervised learning algorithm that can be used for classification or regression problems. An SVM tries to find an optimal hyperplane that separates the data into different categories or classes. However, sometimes the data is not linearly separable, meaning there is no straight line or plane that can separate them. In such cases, a polynomial kernel can help improve the prediction of the SVM by transforming the data into a higher-dimensional space where it becomes linearly separable. A polynomial kernel is a function that computes the similarity between two data points using a polynomial function of their features.

by Coleen at Jun 23, 2024, 06:11 AM

Contribute your Thoughts:

Luisa

2 months ago

I've got a brilliant idea - why not just fill in the missing values with the average of the entire dataset, and then add a random number to it? That way, it'll be like a surprise every time!

upvoted 0 times

Ellsworth

10 days ago

I agree, adding random values might not be the best approach for filling in missing features in a normally distributed dataset.

upvoted 0 times

...

Rosendo

13 days ago

I think so too. It might be better to just fill in the missing values with the average of the entire dataset to maintain the distribution.

upvoted 0 times

...

Genevieve

1 month ago

That's an interesting idea, but wouldn't adding a random number introduce noise into the data?

upvoted 0 times

...
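
The trade-off argued over in this thread (plain mean versus mean plus a random number) can be checked with a small simulation. This is only a sketch under stated assumptions: the feature is simulated as normal with mean 50 and standard deviation 10, about 30% of values are hidden completely at random, and the "random number" is drawn from a normal distribution with the observed standard deviation. None of these numbers come from the question.

```python
import random
import statistics

rng = random.Random(42)

# Simulated normally distributed feature (assumed mean 50, std dev 10).
full = [rng.gauss(50, 10) for _ in range(10_000)]

# Hide roughly 30% of the values completely at random.
observed = [x for x in full if rng.random() > 0.3]
n_missing = len(full) - len(observed)

mu = statistics.mean(observed)
sigma = statistics.stdev(observed)

# Plain mean imputation: every gap receives the same value.
mean_filled = observed + [mu] * n_missing

# Mean plus noise: every gap receives the mean plus a draw from N(0, sigma).
noisy_filled = observed + [mu + rng.gauss(0, sigma) for _ in range(n_missing)]

print("observed std dev:   ", round(sigma, 2))
print("mean-filled std dev:", round(statistics.stdev(mean_filled), 2))
print("noisy-filled std dev:", round(statistics.stdev(noisy_filled), 2))
```

Under these assumptions, plain mean imputation keeps the column mean unchanged but shrinks the spread, because every filled value sits exactly at the center. Adding noise restores the spread at the cost of putting made-up individual values into the rows, which is the noise concern raised above.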

Tammara

2 months ago

Ah, the age-old dilemma of missing data. Deleting rows or columns seems a bit drastic, but I suppose if you're feeling brave, you could always roll the dice and see what happens.

upvoted 0 times

...

Monroe

2 months ago

B is a terrible idea. Filling in with random values? That's just asking for trouble. Might as well flip a coin while you're at it.

upvoted 0 times

Gilma

1 month ago

C) Fill in missing features with the average of observed values for that feature in the entire dataset.

upvoted 0 times

...

Susy

1 month ago

A) Delete entire rows that contain any missing features.

upvoted 0 times

...

Gerry

2 months ago

C) Fill in missing features with the average of observed values for that feature in the entire dataset.

upvoted 0 times

...

Mendy

2 months ago

D? Are you kidding me? Deleting entire columns with missing data is way too extreme. That's like throwing the baby out with the bathwater.

upvoted 0 times

...

Estrella

2 months ago

C is the way to go! Filling in with the average of observed values makes the most sense when dealing with a normal distribution.

upvoted 0 times

Rosann

1 month ago

I think deleting entire rows with missing features is too drastic; filling in with the average is more reasonable.

upvoted 0 times

...

Aide

1 month ago

I agree, filling in with the average of observed values is the best approach.

upvoted 0 times

...

Earnestine

2 months ago

I think filling in missing features with random values for that feature in the training set could introduce bias, so I would go with option C.

upvoted 0 times

...

Jesusita

2 months ago

I disagree; I believe we should delete entire rows that contain any missing features.

upvoted 0 times

...

Albina

3 months ago

I think we should fill in missing features with the average of observed values for that feature in the entire dataset.

upvoted 0 times

...