A company operates large cranes at a busy port. The company plans to use machine learning (ML) for predictive maintenance of the cranes to avoid unexpected breakdowns and to improve productivity.
The company already uses sensor data from each crane to monitor the health of the cranes in real time. The sensor data includes rotation speed, tension, energy consumption, vibration, pressure, and temperature for each crane. The company contracts AWS ML experts to implement an ML solution.
Which potential findings would indicate that an ML-based solution is suitable for this scenario? (Select TWO.)
Stratified sampling is a technique that preserves the class distribution of the original dataset when creating a smaller or split dataset: the proportion of examples from each class in the original dataset is maintained in each resulting subset. This can help improve the validation accuracy of the model by ensuring that the validation dataset is representative of the original dataset and not skewed toward any class. A representative split reduces the variance of the validation estimate and makes it less likely that a minority class is missing from the training or validation data, which supports better generalization. Stratified sampling can also be combined with oversampling or undersampling, depending on whether the goal is to increase or decrease the size of the dataset.
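As a minimal sketch of the idea, the following Python snippet splits a list of labels so that each class keeps roughly the same share in the training and validation parts. The function name `stratified_split`, the `"ok"`/`"fault"` labels, and the 90/10 class ratio are hypothetical choices for illustration and do not come from the original question.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split by the same fraction."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class (at least one example,
        # so a class with a single example ends up in the test part only).
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical imbalanced dataset: 90 "ok" rows and 10 "fault" rows.
labels = ["ok"] * 90 + ["fault"] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)

test_faults = sum(1 for i in test_idx if labels[i] == "fault")
train_faults = sum(1 for i in train_idx if labels[i] == "fault")
print(len(train_idx), train_faults)  # 80 8
print(len(test_idx), test_faults)    # 20 2
```

Both parts keep the original 10% share of the minority class. Libraries such as scikit-learn offer comparable functionality, but this sketch uses only the standard library to keep the mechanics visible.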
The other options are not effective ways to improve the validation accuracy of the model. Acquiring additional data about the majority classes in the original dataset will only increase the imbalance and make the model more biased toward the majority classes. Using a smaller, randomly sampled version of the training dataset does not guarantee that the class distribution is preserved and may lose important information from the minority classes. Performing systematic sampling on the original dataset also does not guarantee that the class distribution is preserved, and it can introduce sampling bias when the ordering of the dataset interacts with the sampling interval, for example when the rows follow a repeating pattern or when a small class occupies fewer consecutive rows than the interval.
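One way systematic sampling can distort the class distribution is when the row order has a repeating pattern that lines up with the sampling interval. The sketch below uses a hypothetical dataset in which every tenth row is a `"fault"` record; the helper `systematic_sample` and the labels are illustrative assumptions, not part of the original question.

```python
from collections import Counter


def systematic_sample(items, step, start=0):
    """Take every `step`-th item, beginning at index `start`."""
    return items[start::step]


# Hypothetical dataset with a periodic order: 9 "ok" rows, then 1 "fault" row.
labels = (["ok"] * 9 + ["fault"]) * 10

sample_a = systematic_sample(labels, step=10, start=0)
sample_b = systematic_sample(labels, step=10, start=9)

print(Counter(labels))    # Counter({'ok': 90, 'fault': 10})
print(Counter(sample_a))  # Counter({'ok': 10})
print(Counter(sample_b))  # Counter({'fault': 10})
```

Depending only on the starting offset, the sample contains no minority examples at all or nothing but minority examples, even though the full dataset holds 10% of them.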