
Amazon MLS-C01 Exam - Topic 2 Question 95 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 95
Topic #: 2

A machine learning specialist works for a fruit processing company and needs to build a system that categorizes apples into three types. The specialist has collected a dataset that contains 150 images of each type of apple and has applied transfer learning with this dataset to a neural network that was pretrained on ImageNet.

The company requires at least 85% accuracy to make use of the model.

After an exhaustive grid search, the optimal hyperparameters produced the following:

68% accuracy on the training set

67% accuracy on the validation set

What can the machine learning specialist do to improve the system's accuracy?
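The accuracy figures themselves carry the diagnosis: training and validation accuracy are nearly equal and both far below the 85% target, which is the signature of high bias (underfitting) rather than high variance (overfitting). A minimal sketch of that reasoning in plain Python (the gap threshold is an illustrative assumption, not a rule from the question):

```python
def diagnose(train_acc: float, val_acc: float, target: float,
             gap_tol: float = 0.05) -> str:
    """Classify a model's failure mode from train/validation accuracy.

    gap_tol is an illustrative cutoff for what counts as a 'large'
    train/validation gap; it is an assumption for this sketch only.
    """
    if train_acc >= target and val_acc >= target:
        return "meets target"
    if train_acc - val_acc > gap_tol:
        # The model fits the training data much better than unseen data.
        return "high variance (overfitting): more data or regularization may help"
    # Both accuracies are low and close together.
    return "high bias (underfitting): a higher-capacity model or better features may help"

# The scenario from the question: 68% train, 67% validation, 85% target.
print(diagnose(0.68, 0.67, 0.85))
```

With a 1-point gap and both accuracies 17+ points below target, the sketch lands on the high-bias branch, which is why comments below debate model capacity versus more data.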

Suggested Answer: A

Amazon SageMaker automatic model tuning, also known as hyperparameter optimization (HPO), finds the best version of a model by running many training jobs on the dataset with the algorithm and hyperparameter ranges that you specify. By default it uses Bayesian optimization: it learns from the results of completed training jobs to choose the next hyperparameter combination to evaluate, so it can explore a larger search space than an exhaustive grid while launching fewer training jobs. Because the training and validation accuracies are low and close together (68% and 67%), the model is underfitting rather than overfitting, and hyperparameters that govern how the pretrained network is fine-tuned, such as the learning rate, the number of epochs, and the number of frozen layers, are good candidates for further tuning. Uploading the model to a SageMaker notebook instance and launching a hyperparameter tuning job also requires less effort than collecting additional labeled images or redesigning the network.

References:

* Perform Automatic Model Tuning with SageMaker

* How Hyperparameter Tuning Works
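The question stresses that the grid search was exhaustive, and that is the weakness of grids: cost grows multiplicatively with each hyperparameter, and a coarse grid can only ever test the exact values it enumerates, while random or adaptive search samples the continuous space. A toy pure-Python comparison (the objective function and the value ranges below are made-up stand-ins, not anything from the question or from SageMaker):

```python
import itertools
import random

def objective(lr: float, epochs: int) -> float:
    """Toy stand-in for validation accuracy as a function of two
    hyperparameters; purely illustrative, not a real model."""
    return 1.0 - abs(lr - 0.01) * 20 - abs(epochs - 30) / 100

# Exhaustive grid: cost is the product of the per-parameter grid sizes.
lrs = [0.001, 0.01, 0.1]
epoch_choices = [10, 20, 30]
grid = list(itertools.product(lrs, epoch_choices))
best_grid = max(grid, key=lambda p: objective(*p))
print(f"grid search tried {len(grid)} combos, best={best_grid}")

# Random search: samples the *continuous* space with the same budget,
# so it can land on values (e.g. epochs=28) no coarse grid contains.
random.seed(0)
samples = [(random.uniform(0.001, 0.1), random.randint(10, 40))
           for _ in range(9)]
best_rand = max(samples, key=lambda p: objective(*p))
print(f"random search tried {len(samples)} combos, best={best_rand}")
```

Bayesian optimization, as used by SageMaker's tuning feature, goes one step further than random search by steering each new sample toward regions that previous results suggest are promising.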


Contribute your Thoughts:

Phyliss
3 months ago
Not sure if just retraining with the same model will do much...
upvoted 0 times
...
Xenia
3 months ago
Definitely B, adding data should help reduce bias.
upvoted 0 times
...
Earleen
4 months ago
Wait, only 68% on training? That seems low for transfer learning.
upvoted 0 times
...
Mireya
4 months ago
I think option C could help too, more layers might capture better features.
upvoted 0 times
...
Ivette
4 months ago
More data is always a good idea!
upvoted 0 times
...
Jani
4 months ago
I’m a bit confused about the hyperparameter optimization part. Would using Amazon SageMaker really make a difference? Option A seems interesting but I’m not sure.
upvoted 0 times
...
Maryann
5 months ago
I feel like we practiced a question similar to this, and I think retraining with more data is often a solid approach. Option B sounds right to me.
upvoted 0 times
...
Xuan
5 months ago
I'm not entirely sure, but I think using a more complex model might help improve accuracy. Option C could be worth considering.
upvoted 0 times
...
Paola
5 months ago
I remember we discussed how adding more data can help reduce bias in models. So, option B seems like a good choice.
upvoted 0 times
...
Pamella
5 months ago
Alright, I've got this. The key is to increase the model's capacity without overfitting. Adding more data and optimizing the hyperparameters should do the trick. I'm feeling confident about this one.
upvoted 0 times
...
Tish
5 months ago
Okay, let's think this through. The model is already underfitting, so adding more layers or a more complex model might not be the best approach. I think focusing on the data and hyperparameters is the way to go.
upvoted 0 times
...
Jesus
5 months ago
Hmm, this is a tricky one. I'm not sure if I should try to optimize the hyperparameters or add more data. Maybe I'll try both and see what works better.
upvoted 0 times
...
Marcelle
5 months ago
I'm a bit confused here. The training and validation accuracies are pretty close, so I'm not sure if the issue is bias or variance. Maybe I should try different techniques and see what works best.
upvoted 0 times
...
Adelle
10 months ago
I'm leaning towards option D. Starting from scratch with a new model might be the best way to meet the 85% accuracy requirement. Can't hurt to try, right?
upvoted 0 times
Ben
9 months ago
Maybe combining both options D and B could be the best approach to try and meet the accuracy requirement.
upvoted 0 times
...
Hyman
9 months ago
I agree, more data could definitely make a difference in improving the model's performance.
upvoted 0 times
...
Malinda
9 months ago
I think option B could also be helpful. Adding more data to the training set might help reduce bias and improve accuracy.
upvoted 0 times
...
...
Lili
10 months ago
Ha! This is a tough one. Maybe we could try uploading the model to a SageMaker notebook and let the AI figure it out for us. Wouldn't that be a hoot?
upvoted 0 times
...
Oren
10 months ago
I disagree, I believe option C is the better choice. Using a more complex neural network model with more layers that is pretrained on ImageNet will increase the variance and potentially lead to higher accuracy.
upvoted 0 times
Ling
9 months ago
I think trying out both options could be the best approach.
upvoted 0 times
...
Natalie
9 months ago
But adding more data to the training set might also be beneficial.
upvoted 0 times
...
Carman
9 months ago
Option C is a good choice, it could help increase accuracy.
upvoted 0 times
...
...
Herman
11 months ago
I think option B is the way to go. Adding more data to the training set will help reduce the bias and improve the overall accuracy of the model.
upvoted 0 times
Alaine
9 months ago
C) Use a neural network model with more layers that are pretrained on ImageNet and apply transfer learning to increase the variance.
upvoted 0 times
...
Nohemi
9 months ago
A) Upload the model to an Amazon SageMaker notebook instance and use the Amazon SageMaker HPO feature to optimize the model's hyperparameters.
upvoted 0 times
...
Dylan
9 months ago
B) Add more data to the training set and retrain the model using transfer learning to reduce the bias.
upvoted 0 times
...
Refugia
10 months ago
A) Upload the model to an Amazon SageMaker notebook instance and use the Amazon SageMaker HPO feature to optimize the model's hyperparameters.
upvoted 0 times
...
Gracia
10 months ago
B) Add more data to the training set and retrain the model using transfer learning to reduce the bias.
upvoted 0 times
...
...
Kindra
11 months ago
But what about using Amazon SageMaker HPO feature to optimize hyperparameters?
upvoted 0 times
...
Huey
11 months ago
I agree with Ricki, adding more data can help reduce bias.
upvoted 0 times
...
Ricki
11 months ago
I think we should add more data to the training set.
upvoted 0 times
...
