Amazon MLS-C01 Exam - Topic 3 Question 96 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 96
Topic #: 3
A company operates large cranes at a busy port. The company plans to use machine learning (ML) for predictive maintenance of the cranes to avoid unexpected breakdowns and to improve productivity.

The company already uses sensor data from each crane to monitor the health of the cranes in real time. The sensor data includes rotation speed, tension, energy consumption, vibration, pressure, and temperature for each crane. The company contracts AWS ML experts to implement an ML solution.

Which potential findings would indicate that an ML-based solution is suitable for this scenario? (Select TWO.)

Suggested Answer: A

Stratified sampling is a technique that preserves the class distribution of the original dataset when creating a smaller or split dataset: the proportion of examples from each class in the original dataset is maintained in every subset. Stratified sampling can help improve the validation accuracy of the model by ensuring that the validation dataset is representative of the original dataset and not biased towards any class. A representative split lowers the variance of the validation estimate and gives a more reliable picture of how well the model generalizes. Stratified sampling can be combined with both oversampling and undersampling methods, depending on whether the goal is to increase or decrease the size of the dataset.
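
To make the idea concrete, here is a minimal sketch of a stratified split in plain Python. The function name `stratified_split`, the `holdout_fraction` parameter, and the toy "ok"/"fail" labels are all assumptions made for illustration; they do not come from the exam question. In practice a library routine would normally do this job.

```python
import random
from collections import defaultdict


def stratified_split(labels, holdout_fraction=0.2, seed=0):
    """Return (train_idx, holdout_idx) so that each class keeps
    roughly the same share in both subsets. Hypothetical helper."""
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, holdout_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_holdout = round(len(indices) * holdout_fraction)
        holdout_idx.extend(indices[:n_holdout])
        train_idx.extend(indices[n_holdout:])
    return sorted(train_idx), sorted(holdout_idx)


# Toy imbalanced dataset: 90% "ok", 10% "fail" (made-up data).
labels = ["ok"] * 90 + ["fail"] * 10
train_idx, holdout_idx = stratified_split(labels, holdout_fraction=0.2)

holdout_labels = [labels[i] for i in holdout_idx]
print(holdout_labels.count("ok"), holdout_labels.count("fail"))  # prints: 18 2
```

The holdout set keeps the 90/10 ratio of the full dataset, which is the property the explanation above relies on.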

The other options are not effective ways to improve the validation accuracy of the model. Acquiring additional data about the majority classes in the original dataset will only increase the imbalance and make the model more biased towards the majority classes. Using a smaller, randomly sampled version of the training dataset will not guarantee that the class distribution is preserved and may result in losing important information from the minority classes. Performing systematic sampling on the original dataset will also not ensure that the class distribution is preserved and may introduce sampling bias if the original dataset is ordered or grouped by class.
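
The point about systematic sampling can be illustrated with a small sketch. It assumes a made-up dataset whose ordering happens to repeat every ten records, which is one way an ordered dataset can interact badly with a fixed sampling step; the helper name `systematic_sample` is hypothetical.

```python
def systematic_sample(items, step, start=0):
    """Take every `step`-th item beginning at `start`. Hypothetical helper."""
    return items[start::step]


# Made-up ordered data: every tenth record is a failure.
labels = (["ok"] * 9 + ["fail"]) * 10

sample_a = systematic_sample(labels, step=10, start=0)
sample_b = systematic_sample(labels, step=10, start=9)

# The same step gives opposite class mixes depending on the start offset.
print(sample_a.count("fail"), sample_b.count("fail"))  # prints: 0 10
```

The full dataset is 10% failures, yet one sample contains none and the other contains only failures, so neither preserves the class distribution.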

Alaine
3 months ago
The presence of common failure types is a big plus for predictive maintenance!
upvoted 0 times
...
Lashon
3 months ago
Simple rule-based thresholds might still work better than ML in some cases.
upvoted 0 times
...
Levi
3 months ago
Wait, what if the data is biased towards one crane model?
upvoted 0 times
...
Aja
4 months ago
Totally agree, having detailed data is key for ML success!
upvoted 0 times
...
Christa
4 months ago
The historical sensor data from the cranes are available with high granularity for the last 3 years.
upvoted 0 times
...
Huey
4 months ago
I practiced a similar question where the amount and quality of data were key factors, so I lean towards D and E as the most promising indicators for using ML here.
upvoted 0 times
...
Reita
4 months ago
I think having high granularity in data is important, but I also recall that if we lack failure data for different crane models, it could limit the effectiveness of the ML solution.
upvoted 0 times
...
Carmelina
4 months ago
I'm not entirely sure, but I feel like if the data shows that simple rules can predict failures, it might mean ML isn't necessary. So maybe B isn't a good option?
upvoted 0 times
...
Francine
5 months ago
I remember we discussed how having a large amount of historical data is crucial for training ML models, so I think options D and E might be the right choices.
upvoted 0 times
...
Barrett
5 months ago
I'm feeling pretty confident about this one. The question is really asking us to evaluate the suitability of the data for an ML-based solution. Options D and E seem like the strongest choices based on the information provided.
upvoted 0 times
...
Renay
5 months ago
Okay, I think I've got a handle on this. The key is to identify if the historical data has enough breadth and depth to train effective ML models. Options B and E seem like the best bets here.
upvoted 0 times
...
Jennifer
5 months ago
Hmm, I'm a bit confused by the question. It's not clear to me what the "potential findings" are referring to exactly. I'll need to read through it more carefully.
upvoted 0 times
...
Malissa
5 months ago
This seems like a pretty straightforward ML scenario. The key is to look for high-quality, comprehensive historical data that can be used to train predictive models.
upvoted 0 times
...
Val
5 months ago
This is a tricky one. I'm not sure if the lack of data for certain time periods (option A) would necessarily be a dealbreaker, but the other options do seem more relevant. I'll need to think it through carefully.
upvoted 0 times
...
Tyisha
5 months ago
Okay, let's see. I know Accounts and Work Orders can have Milestones, so I'll go with those two options.
upvoted 0 times
...
Maryann
5 months ago
Hmm, I'm a bit unsure about this one. I know traffic engineering is important, but I'm not sure I can recall the specific benefits. I'll have to think it through carefully.
upvoted 0 times
...
Olive
10 months ago
Looks like the company is really 'crane'-ing to implement this ML solution.
upvoted 0 times
...
Glenn
10 months ago
I bet the cranes are so big, they have their own gravitational pull. That's why the data is 'high-granularity'!
upvoted 0 times
Giuseppe
8 months ago
C: I agree, it shows that the company has a good amount of data to train the ML model effectively.
upvoted 0 times
...
Paris
9 months ago
B: Yeah, having high granularity data for the last 3 years is crucial for accurate predictions.
upvoted 0 times
...
Viola
9 months ago
A: I think option D is a good indicator that an ML-based solution would work well.
upvoted 0 times
...
...
Kaitlyn
10 months ago
I can't believe they're actually considering A. That's like the opposite of what you want for an ML solution. Hopefully, the other candidates aren't as clueless as that.
upvoted 0 times
Shoshana
9 months ago
C) The historical sensor data contains failure data for only one type of crane model that is in operation and lacks failure data of most other types of crane that are in operation.
upvoted 0 times
...
Kanisha
9 months ago
B) The historical sensor data shows that simple rule-based thresholds can predict crane failures.
upvoted 0 times
...
Rikki
9 months ago
A) The historical sensor data does not include a significant number of data points and attributes for certain time periods.
upvoted 0 times
...
...
Annmarie
10 months ago
D is a no-brainer. 3 years of high-granularity data? Sign me up!
upvoted 0 times
Valentin
9 months ago
B) The historical sensor data shows that simple rule-based thresholds can predict crane failures.
upvoted 0 times
...
Charlene
9 months ago
E) The historical sensor data contains most common types of crane failures that the company wants to predict.
upvoted 0 times
...
Timothy
9 months ago
D) The historical sensor data from the cranes are available with high granularity for the last 3 years.
upvoted 0 times
...
...
Melissa
10 months ago
I'm torn between C and E. It's important to have failure data for all the crane models, but the common failure types are also crucial.
upvoted 0 times
...
Ricki
10 months ago
Hmm, option B seems a bit questionable. Shouldn't ML be better at predicting failures than simple rule-based thresholds?
upvoted 0 times
...
Yuki
11 months ago
I believe option D is also important. Having high granularity data for the last 3 years can help in training the ML model effectively.
upvoted 0 times
...
Nathan
11 months ago
I agree with you, Carin. If simple rules can predict failures, then ML can definitely improve on that.
upvoted 0 times
...
Carin
11 months ago
I think option B is a good indicator for using ML.
upvoted 0 times
...