
Amazon Exam MLS-C01 Topic 2 Question 95 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 95
Topic #: 2

A machine learning specialist works for a fruit processing company and needs to build a system that categorizes apples into three types. The specialist has collected a dataset that contains 150 images for each type of apple and has used this dataset to apply transfer learning to a neural network that was pretrained on ImageNet.

The company requires at least 85% accuracy to make use of the model.

After an exhaustive grid search, the optimal hyperparameters produced the following:

68% accuracy on the training set

67% accuracy on the validation set

What can the machine learning specialist do to improve the system's accuracy?

Suggested Answer: A

SageMaker Data Wrangler is a feature of SageMaker Studio that provides an end-to-end solution for importing, preparing, transforming, featurizing, and analyzing data. Data Wrangler includes built-in analyses that generate visualizations and data insights in a few clicks. One of these is the Quick Model visualization, which quickly evaluates the data and produces an importance score for each feature. A feature importance score indicates how useful a feature is at predicting a target label. The score lies in the range [0, 1], and a higher number indicates that the feature is more important to the whole dataset. The Quick Model visualization uses a random forest model to calculate the importance of each feature using the Gini importance method. This method measures the total reduction in node impurity (a measure of how well a node separates the classes) that is attributed to splitting on a particular feature. The ML developer can use the Quick Model visualization to obtain the importance scores for each feature of the dataset and use them to guide feature engineering. This solution requires the least development effort compared to the other options.
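The impurity reduction that Gini importance is built on can be illustrated with a small sketch. This is not Data Wrangler's implementation: it only computes the Gini impurity decrease for a single threshold split, whereas a random forest accumulates such decreases over every split in every tree and then normalizes them. The function names and the toy data are hypothetical, chosen for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(values, labels, threshold):
    """Gini impurity reduction from splitting samples on value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Child impurities are weighted by the share of samples in each child.
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


# Hypothetical toy data: two classes, one informative and one uninformative feature.
labels = ["a", "a", "b", "b"]
informative = [1.0, 2.0, 8.0, 9.0]  # separates the classes at threshold 5.0
noisy = [1.0, 8.0, 2.0, 9.0]        # each side of the split stays mixed

gain_informative = impurity_decrease(informative, labels, threshold=5.0)
gain_noisy = impurity_decrease(noisy, labels, threshold=5.0)

print(f"informative feature: {gain_informative:.2f}")
print(f"noisy feature:       {gain_noisy:.2f}")
```

A feature whose splits keep producing large decreases ends up with a high importance score, while a feature whose splits leave the children as mixed as the parent contributes nothing.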

References:

* Analyze and Visualize

* Detect multicollinearity, target leakage, and feature correlation with Amazon SageMaker Data Wrangler


Contribute your Thoughts:

Adelle
25 days ago
I'm leaning towards option D. Starting from scratch with a new model might be the best way to meet the 85% accuracy requirement. Can't hurt to try, right?
upvoted 0 times
...
Lili
26 days ago
Ha! This is a tough one. Maybe we could try uploading the model to a SageMaker notebook and let the AI figure it out for us. Wouldn't that be a hoot?
upvoted 0 times
...
Oren
29 days ago
I disagree, I believe option C is the better choice. Using a more complex neural network model with more layers that is pretrained on ImageNet will increase the variance and potentially lead to higher accuracy.
upvoted 0 times
Carman
4 days ago
Option C is a good choice, it could help increase accuracy.
upvoted 0 times
...
...
Herman
2 months ago
I think option B is the way to go. Adding more data to the training set will help reduce the bias and improve the overall accuracy of the model.
upvoted 0 times
Refugia
24 days ago
A) Upload the model to an Amazon SageMaker notebook instance and use the Amazon SageMaker HPO feature to optimize the model's hyperparameters.
upvoted 0 times
...
Gracia
1 month ago
B) Add more data to the training set and retrain the model using transfer learning to reduce the bias.
upvoted 0 times
...
...
Kindra
2 months ago
But what about using Amazon SageMaker HPO feature to optimize hyperparameters?
upvoted 0 times
...
Huey
2 months ago
I agree with Ricki, adding more data can help reduce bias.
upvoted 0 times
...
Ricki
2 months ago
I think we should add more data to the training set.
upvoted 0 times
...
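
Several comments above argue in terms of bias and variance. A rough way to read the numbers in the question is sketched below. It is a heuristic only, it does not settle which answer option is correct, and the function name and the `gap_tolerance` default are assumptions made for illustration.

```python
def diagnose(train_acc, val_acc, target_acc, gap_tolerance=0.05):
    """Heuristic reading of training and validation accuracy.

    gap_tolerance is an assumed cutoff for how far training accuracy may
    exceed validation accuracy before the gap is treated as overfitting.
    """
    gap = train_acc - val_acc
    if gap > gap_tolerance:
        # Training accuracy is well above validation accuracy.
        return "high variance"
    if train_acc < target_acc:
        # The model falls short even on the data it was trained on.
        return "high bias"
    if val_acc >= target_acc:
        return "meets target"
    return "borderline"


# Figures from the question: 68% training, 67% validation, 85% required.
print(diagnose(train_acc=0.68, val_acc=0.67, target_acc=0.85))
```

With a one-point gap between the two sets and both far below the 85% requirement, the heuristic classifies the model as underfitting rather than overfitting.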
