Databricks Machine Learning Associate Exam - Topic 2 Question 45 Discussion

In which of the following situations is it preferable to impute missing feature values with their median value over the mean value?
A) When the features are of the categorical type
B) When the features are of the boolean type
C) When the features contain a lot of extreme outliers
D) When the features contain no outliers
E) When the features contain no missing values

Actual exam question for the Databricks Machine Learning Associate exam
Question #: 45
Topic #: 2

Suggested Answer: C

Imputing missing values with the median is generally preferable to the mean when a numeric feature contains many extreme outliers. The mean is pulled toward extreme values, whereas the median depends only on the middle of the sorted data, so it is a more robust measure of central tendency. Median-imputed values therefore stay closer to a typical data point and distort the feature's distribution less. The other options do not favor the median: categorical and boolean features (A, B) are normally imputed with the most frequent value, without outliers (D) the mean is not distorted and the median offers no particular advantage, and with no missing values (E) there is nothing to impute. Reference:

Data Imputation Techniques (Dealing with Outliers).
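
To make the reasoning behind answer C concrete, here is a minimal sketch in plain Python. The feature values are hypothetical (made up for illustration, not taken from the exam or any dataset), and the variable names are assumptions of this example. It shows how a single extreme outlier drags the mean-based fill value far from the typical data point while the median-based fill value stays put.

```python
from statistics import mean, median

# Hypothetical numeric feature (e.g. income in thousands) with one extreme
# outlier (5000) and two missing entries (None). Values are illustrative only.
incomes = [42, 45, 47, None, 50, 52, 55, None, 5000]

# Compute the fill statistics from the observed (non-missing) values only.
observed = [x for x in incomes if x is not None]

mean_fill = mean(observed)      # pulled upward by the outlier
median_fill = median(observed)  # middle of the sorted observed values

# Replace each missing entry with the chosen statistic.
imputed_with_mean = [mean_fill if x is None else x for x in incomes]
imputed_with_median = [median_fill if x is None else x for x in incomes]

print(f"mean fill:   {mean_fill:.1f}")
print(f"median fill: {median_fill}")
```

In this sketch the mean-based fill value is larger than every non-outlier observation, so the imputed rows would look nothing like a typical record, whereas the median-based fill value sits among the typical values. In Spark ML the same choice is available through the `strategy` parameter of `pyspark.ml.feature.Imputer`.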

